Y-Chromosome and Mitochondrial DNA Haplotypes: Tales of Human Ancestry

by David Warmflash, MD, Nathan H Lents, Ph.D.

How much is your genome (Figure 1) influenced by a great-great-great-great grandparent? The question is tricky since you have 64 of them. In theory, the number of ancestors you have doubles with each generation backward. When you reach 25 or 30 generations into the past, 600-700 years ago, the number of ancestors you should theoretically expect is larger than the entire human population of that time. That is obviously impossible and begins to show how challenging it can be to track ancestry. There’s something wrong with our math.
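The doubling arithmetic is easy to check for yourself. The short Python sketch below counts the theoretical number of ancestor "slots" in a given generation, assuming that no ancestor ever appears twice in the tree. The function name and the figure of 25 years per generation are assumptions chosen for illustration, not values from a particular study.

```python
YEARS_PER_GENERATION = 25  # assumed average; real generation times vary


def ancestor_slots(generations_back: int) -> int:
    """Theoretical number of ancestors exactly `generations_back` generations ago,
    assuming every slot in the family tree is filled by a different person."""
    return 2 ** generations_back


for g in (1, 2, 6, 10, 20, 30):
    years = g * YEARS_PER_GENERATION
    print(f"{g:>2} generations back (~{years} years): {ancestor_slots(g):,} slots")
```

Six generations back gives the 64 great-great-great-great grandparents mentioned above, and 30 generations back gives more than a billion slots.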

Human genome
Figure 1: The genome is the entire set of genetic instructions found in a cell. In humans, the genome consists of 23 pairs of chromosomes (found in the nucleus), plus a small chromosome (found in the cells' mitochondria). image © Darryl Leja, NHGRI

In Population Genetics: An Introduction we discussed that genetic drift, gene flow, mutation, and natural selection are all forces of evolution that interact to shape the genomes of populations. But how does one uncover the influence of these forces in sexually reproducing species, such as humans, whose genes get shuffled each generation? In our module Mendel and Independent Assortment we discussed how genes sort independently as they are passed down from generation to generation. This is due to recombination of segments of DNA that transfer, or cross over, during meiosis between homologous chromosomes (similar, but not identical chromosomes, one inherited from the father and one from the mother). Think of a deck of cards nicely organized with hearts, spades, diamonds, and clubs each grouped together at first. Once shuffled, everything is all mixed up. The same thing happens with genetic recombination, and if you attempt to track a certain trait backward more than a couple of generations, you’ll have a very difficult time, considering how many ancestors there are to consider.

But not all of our genes get shuffled in the way that Mendel described in his pea plant experiments. In addition to gene shuffling, a Mendelian force, there are non-Mendelian means by which genetic traits can be passed down through generations. Genetic drift (which we emphasized in our Population Genetics: An Introduction module) is one example, but there are other non-Mendelian phenomena. These include haplotypes, genetic sequences that we inherit from only one parent, which allow us to solve our problems with tracking genetic traits and ancestry backward through time and geography.

Comprehension Checkpoint

True or false: All genetic material is shuffled in a manner described by Gregor Mendel.

Tracking history with the Y-chromosome

William Sanchez of Albuquerque, New Mexico, was fascinated with genealogy and so had his DNA tested in 2001. Certain customs in his family made him ponder his ancestors who had been colonists in Mexico nearly a half millennium before his birth. He knew they had come from Spain, but there were things that his family did that anthropologists would call ‘crypto-Jewish’. They lit candles on Friday nights, covered mirrors when people died, and did several other things that as a boy Sanchez had assumed were normal for a Spanish-speaking Roman Catholic family in the Southwestern United States.

At age 52, though, he knew better, so when he got a call from a scientist at the genetic testing company, he was pretty sure they were going to tell him that he had some Jewish ancestry. Spain, after all, had expelled its Jewish population in 1492, except for those agreeing to convert to Christianity. Some had converted but had continued their Jewish practices in secret, and many had fled Spain for the Americas. It therefore seemed plausible that some genetic sequences common in Jews might be part of his DNA.

Sanchez was puzzled when the scientist on the phone told him that the testing showed he had genetic sequences of priests. Poetically, it was fitting: Sanchez was in fact a priest – a Catholic priest – but the statement didn’t make sense to him.

"I mean that you’re a Kohane," the scientist explained. "You share ancestry with the Jewish priesthood." The company was telling Sanchez that he had a genetic marker that was patrilineal – passed down from father to son, because it was a Y-chromosome haplotype – and identifiable because it was carried by people with a particular historical role in Jewish culture.

Y-chromosome haplotypes

Haplotypes are genetic sequences that we inherit from only one parent. This is different from autosomal genes, which are genes on a numbered chromosome and usually affect males and females in similar ways. Although a small portion of the Y-chromosome has homology, or similarity of genes at the same place, with a region of the X-chromosome, where crossing over can occur between a dozen or so genes, the rest of the Y-chromosome is unique and non-homologous. It is passed down purely from father to son with no recombination. In this way, it is like a surname in many cultures, or for that matter like the Jewish priesthood: passed down from father to son.

While Y-chromosome haplotypes are inherited as single genetic units without shuffling, they are subject to random mutation, just like the rest of the genome. Harmful mutations are weeded out of the population through natural selection, but many mutations can stick around. For example, natural selection does not eliminate mutations that occur in an intron (a non-coding region of a gene) or in non-coding regions between genes.

The DNA polymerase enzyme copies DNA with extremely high accuracy, so random mutation is rare. On average, it happens only about once per hundred million base pairs of DNA per generation. Knowing this rate, we can use mutations in haplotypes as a kind of molecular clock. Based on the amount of variation in haplotypes in different people, scientists can produce family trees estimating the common ancestry of those people.
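To see how a molecular clock can turn mutations into time, consider a rough sketch. If two men share a patrilineal ancestor, mutations have been accumulating independently along both lines of descent since that ancestor lived, so the expected number of differences is about 2 × rate × sequence length × generations. The Python example below simply rearranges that relationship. It uses the mutation rate quoted above, and it assumes a constant rate, neutral mutations, and no site mutating twice. The function name, the region length, and the number of differences are hypothetical values chosen for illustration.

```python
MUTATION_RATE = 1e-8  # mutations per base pair per generation (figure quoted in the text)


def generations_to_common_ancestor(differences: int,
                                   region_length_bp: int,
                                   rate: float = MUTATION_RATE) -> float:
    """Rough molecular-clock estimate of the generations separating two
    haplotypes from their common ancestor.

    Mutations accumulate on both lineages, hence the factor of 2.
    """
    return differences / (2 * rate * region_length_bp)


# Hypothetical example: 20 differences across a 10-million-base-pair region.
gens = generations_to_common_ancestor(differences=20, region_length_bp=10_000_000)
print(f"about {gens:.0f} generations, or roughly {gens * 25:.0f} years "
      "at an assumed 25 years per generation")
```

Real studies use more elaborate statistical models and calibrate the rate against independent evidence, but the underlying logic is the same: more differences imply a more distant common ancestor.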

Y-chromosome family tree

In the mid-1990s, the discovery of numerous Y-chromosome haplotypes enabled scientists to start building a Y-chromosome family tree, consisting of various branches all stemming back to a common trunk representing the common ancestor of all Y-chromosome haplotypes (Figure 2). This common ancestor is called Y-MRCA, for "most recent common ancestor." This is sometimes called the Y-chromosome Adam, and it tracks back to 200,000 – 300,000 years ago.

Y haplogroup tree
Figure 2: This family tree traces various branches all stemming back to a common ancestor of all Y-chromosome haplotypes. This common ancestor is called Y-MRCA, for "most recent common ancestor," or Y-chromosome Adam.

It is important to note that, while all living men have a Y-chromosome that is descended from a single male living in the Pleistocene epoch, there were many other men living at that time and many of them also left many descendants. The Y-chromosome Adam is just the man that all living men have in common through our patrilineal ancestry. Plenty of Adam’s contemporaries also left many descendants, just not through a purely patrilineal line.

In 1997, researchers in Haifa, Israel, and Toronto, Canada applied the same principles to test the claims that the Jewish priesthood began with a single man who lived more than 3,000 years ago. Among DNA from Jewish men from many different countries, the researchers identified two sequences, or markers, that were much more common in Jewish priests (Kohanim; singular Kohane) compared with Jews in general. A year later, the team identified four additional markers common in Kohanim, and they designated the six markers as the J1 Cohen Modal haplotype.

The Cohen Modal haplotype is found in both Ashkenazi and Sephardi Jews, the two largest Jewish populations. It also exists among non-Jews whose ancestry is in the Middle East, but there are great differences in the frequencies of the six markers that make up the haplotype. Notably, the two markers that the team identified initially (called YAP and DYS19) were found to be present in 55 percent of Kohanim. The rate differed between the Ashkenazi and Sephardi populations: 58 percent of Sephardi Kohanim showed the markers versus 48 percent of their Ashkenazi counterparts. But the rates in non-Kohanic Jews and in non-Jews were found to be much lower, suggesting that Kohanim represent a distinct population stemming from a common Y-chromosomal ancestor.

By comparing the Kohanic markers of the J1 Cohen haplotype to other Y-chromosome haplotypes, researchers were able to place Jewish priests on the Y-chromosome family tree. This allowed them to date the common ancestor for the Kohanic priests of both Ashkenazi and Sephardi Jewish populations to 2,400 to 3,000 years ago (Figure 3).

Kohanim family tree
Figure 3: A haplotype tree diagram of the Kohanim population's Y-chromosomal family tree. image © Chriscohen

Sanchez was told he was a priest because he had the Cohen Modal haplotype, but perhaps more surprising is that the same haplotype was also found among the priests of the Lemba tribe in Zimbabwe, South Africa, Mozambique, and Malawi. The Lemba had always claimed to be part of the Jewish People, but anthropologists had been dismissing the idea for over a century. When molecular geneticists tested the Lemba priestly class called the Buba in the early 2000s, not only did they harbor the Cohen Modal haplotype, they carried it at a rate of 65 percent. They were Kohanim, with a patrilineal line even more pure than that of the Kohanim of Ashkenazi and Sephardi Jews. That these African Jews more closely resemble their fellow Africans than they do their fellow Jews of Europe and the Middle East tells of the extensive intermarriage of the Jewish diaspora with local populations. Despite this, the Jewish blood lines were maintained as revealed by their Y-chromosomes.

Comprehension Checkpoint

_____ are genetic sequences that we inherit from only one parent.

Mitochondrial DNA

The Kohanim are one of many groups whose ancestry can be studied based on the Y-chromosome. The same technique used by the Haifa/Toronto researchers has been applied to track the ancestry of various populations of Europe and Asia, as well as Native American tribes, going back thousands of years. Since the Y-MRCA dates back at least 200,000 years, Y-haplotypes have also been used in studying the emergence of our very species.

As useful as Y-chromosome haplotypes are, they track male ancestors only. This has several limitations. For example, Y-chromosome ancestry would give the same result whether a whole population moved from one place to another or just a small group of mostly males migrated to a new area and conquered and subjugated the population there. The latter is what happened throughout Latin America during the age of the conquistadors. Even among individuals with almost exclusively indigenous roots, Spanish last names abound and so do European Y-chromosomes. For this and other reasons, examining Y-chromosomes paints only a partial picture of human ancestry.

The genetics of the Jewish diaspora exemplifies this. Ashkenazi Jews share a biological heritage with non-Jewish populations of central and eastern Europe, where the Ashkenazi population emerged. Sephardi Jews, on the other hand, resemble non-Jews of the geographic areas that were host to Sephardi populations (Spain, Portugal, Turkey, Greece), both in appearance and in the frequency of certain genetic diseases. The same is true of Jewish populations based in the Middle East, and of the Lemba tribe in southeast Africa. The Lemba have dark skin just like all the other peoples in the area and share other genetic similarities with their compatriots in southeast Africa. The Y-chromosome haplotypes of their priestly class, the Buba, are the exception and show that they have common ancestry with Ashkenazi, Sephardi, and other populations of Jews.

How can this mixed ancestry be sorted out? A possible answer comes from another category of haplotypes, those found in mitochondrial DNA (mtDNA).

mtDNA haplotypes

Mitochondria are organelles found in all eukaryotic cells that assist in cellular respiration and the synthesis of ATP. Owing to their distant past as autonomous bacterial cells, mitochondria have some of their own DNA which harbors genes essential to mitochondrial function (Figure 4). Mitochondria reproduce on their own within your cells, and you have quadrillions of them in your body, but not one of them is from your father. Whether you are male or female, all of your mitochondria come from your mother, your mother’s mother, your mother’s mother’s mother, and on and on, going back to the first eukaryotes some billion and a half years ago.

Mitochondrial DNA
Figure 4: Mitochondria are the “power suppliers” for the cell, generating most of the ATP used in cell processes through the conversion of nutrients into energy. Mitochondria, along with mitochondrial DNA (mtDNA), are passed from mother to offspring. image © NIH

Like the Y-chromosome, mitochondrial DNA is inherited as a single copy rather than as paired alleles. Because our maternally inherited mtDNA is not recombined with alleles from our fathers, it stays intact along the matrilineal line. Only the occasional mutation alters mtDNA haplotypes. Also, as with the Y-chromosome, some mutations that occur in mtDNA are not harmful but neutral. These neutral mutations give us the mtDNA haplotypes, which are useful for tracking matrilineal ancestry.
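One way to picture how narrow these two lines of inheritance are is to label every ancestor by the path that leads to them. In the Python sketch below, each ancestor is written as a string of F (father) and M (mother) steps, so "FM" means your father's mother. The labeling scheme and the function name are assumptions made for illustration. Among all the ancestors in a given generation, only the all-F path can supply a Y-chromosome (and only if you are male), and only the all-M path supplies your mtDNA.

```python
from itertools import product


def ancestor_paths(generations_back: int) -> list[str]:
    """Every ancestor slot a given number of generations back, written as a
    path of 'F' (father) and 'M' (mother) steps starting from you."""
    return ["".join(path) for path in product("FM", repeat=generations_back)]


generations = 6  # great-great-great-great grandparents
paths = ancestor_paths(generations)

y_source = "F" * generations    # father's father's ... father
mt_source = "M" * generations   # mother's mother's ... mother

print(f"{len(paths)} ancestor slots {generations} generations back")
print(f"Y-chromosome comes from path {y_source} (1 of {len(paths)}, males only)")
print(f"mtDNA comes from path {mt_source} (1 of {len(paths)})")
```

The other 62 slots in that generation may well have contributed to your shuffled autosomal DNA, but none of them gave you a Y-chromosome or mitochondria.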

mtDNA most recent common ancestor

Just as the tree for Y-chromosome haplotypes coalesces with the Y-MRCA, the mtDNA haplotypes coalesce backward in time to a last common mitochondrial ancestor, sometimes called the mitochondrial Eve. We are all descended from so-called mitochondrial Eve in an unbroken line of mothers and daughters (Figure 5). (Remember, though, that we are also descended from other women living during and even before the time of mitochondrial Eve. We just didn’t get their mitochondrial DNA because we are related to them through at least one male.) Scientists estimate that mitochondrial Eve lived between 100,000 and 200,000 years ago.

Mitochondrial Eve
Figure 5: The Mitochondrial Eve female-lineage. Here the black matrilineal line is descended from mtDNA matrilineal most recent common ancestor (MRCA). image © C. Rottensteiner
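Why should all mtDNA lineages trace back to a single woman? A toy simulation can make this less mysterious. The Python sketch below assumes a constant number of women per generation, each of whom inherits her mtDNA from a randomly chosen woman of the previous generation. These are simplifying assumptions for illustration only, and the function name is hypothetical. Purely by chance, founding lineages die out one by one until a single founder's mtDNA is carried by everyone.

```python
import random


def generations_until_one_lineage(num_women: int, seed: int = 0) -> int:
    """Run a toy population forward in time until every woman carries mtDNA
    descended from the same founding woman; return the generations needed."""
    rng = random.Random(seed)
    # Label each founder's mtDNA lineage with her own index.
    lineages = list(range(num_women))
    generations = 0
    while len(set(lineages)) > 1:
        # Each woman in the new generation inherits the lineage of a randomly
        # chosen mother; some mothers leave several daughters, others none.
        lineages = [rng.choice(lineages) for _ in range(num_women)]
        generations += 1
    return generations


print(generations_until_one_lineage(num_women=100, seed=42))
```

The surviving founder in this toy model is a matrilineal ancestor of the whole population, but she was never the only woman alive, and the other founders may still have descendants through their sons.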

Bible stories aside, it is exceedingly unlikely that Y-chromosome Adam and mitochondrial Eve lived at the same time, let alone in the same place, and they were certainly not alone in the world. However, they were real living people who made their way in the African savannah of a bygone era, totally unaware of their impending genetic legacy.

When scientists study populations of a given ethnic group, they generate one family tree based on the Y-chromosomes and another tree based on mtDNA found in that population. When they do this, the trees almost never coincide perfectly. This is because humans have not always moved around in families consisting of a father, a mother, and their children. Polygyny was common in many cultures prior to modern times, and, during both prehistoric and modern times, men have migrated to new places and taken local women as mates. By studying both the Y-chromosomes and the mtDNA, we can get both sides of the story.

While the Y-chromosome reveals links between Ashkenazi, Sephardi, Middle Eastern, and Lemba Jewish populations, the matrilineal lineage recorded in mtDNA tells another story. It turns out that the mitochondria of most Ashkenazi Jews have their most recent common ancestry on the Italian peninsula around 2,000 years ago. Thus, the matriarchs of much of the Ashkenazi population were not Jewish. They were non-Jews (pagans or Christians) from the heart of the Roman Empire. Rather than being in conflict with the Y-chromosome studies, mtDNA merely adds a new dimension to the story of Jewish populations.

Piecing the genetic information together with what is known from scraps of historical record, what seems to have happened is that the Ashkenazi population started with Jewish men of Middle Eastern descent who traveled through the Roman Empire, perhaps because they were merchants, and took local wives. The same thing happened in Zimbabwe: Jewish men traveled there in ancient times, bringing their traditions along with their Y-chromosomes. Upon marrying the Jewish men, the local women added African genes to the gene pool of the resulting mixed population.

Comprehension Checkpoint

_____ were autonomous bacterial cells in their distant past, which is why they have some of their own DNA.

Haplotypes and early humans

The same year that the Haifa and Toronto researchers published their findings on Y-chromosome haplotypes in Kohanim, a mtDNA study was dominating the science news cycle. Applying techniques that were state of the art in the late 1990s, a team led by Swedish geneticist Svante Pääbo of the Max Planck Institute was able to extract mtDNA from bones of a Neanderthal specimen – actually, the original "Neanderthal man," unearthed in 1856 in the Neanderthal Valley in what is now Germany. Certain bone cells can have multiple nuclei, but nuclei throughout bone are vastly outnumbered by mitochondria. Thus, mtDNA is orders of magnitude more abundant than nuclear DNA. Each cell has hundreds or thousands of copies of the same mitochondrial chromosome, so Pääbo and his team knew that if they could get any useful DNA sequences, they’d be coming from mtDNA. That was already a tall order in 1997, whereas getting nuclear DNA from an ancient bone was considered science fiction.

After extracting DNA and amplifying it (making multiple copies of its sequence) through a technique called the polymerase chain reaction (PCR), the team went through a painstaking process of weeding out sequences belonging to different species of bacteria that came from the dirt. This left a tiny percentage of the sample; that was the mtDNA from the Neanderthal individual to whom the bones had belonged.

Neanderthals were a population of humans whose bones and tools indicate that they lived from about 300,000 years ago until about 30,000 years ago, when they went extinct. They existed in parallel with our own species, Homo sapiens sapiens (modern humans), and with other species of humans that also suffered extinction. A big question has always been whether Neanderthals and modern humans interbred or were reproductively isolated from one another. Using the mtDNA extracted and amplified from the Neanderthal specimen, Pääbo and his team were able to do haplotype analysis in comparison with mtDNA from present-day modern humans (Figure 6). The result showed a difference between Neanderthal mtDNA and modern human mtDNA greater than any difference between different branches of the modern human mtDNA family tree. In other words, Neanderthals did not descend from the mitochondrial Eve of modern humans. Instead, Neanderthals had their own mitochondrial Eve, which Pääbo’s analysis showed had lived a lot further into the past than our own mitochondrial Eve. This indicated that Neanderthals did not contribute mtDNA to our line.
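The core of this comparison is simple counting: line up two sequences and count the positions where they differ. The Python sketch below shows the idea with short, made-up sequences that are already aligned. They are not real mtDNA data, and the names used are hypothetical. It then checks whether the outgroup sequence differs from every modern sequence by more than any two modern sequences differ from each other, which is the pattern described above.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Made-up, pre-aligned toy sequences for illustration only (NOT real mtDNA).
modern = {
    "modern_1": "ACGTACGTACGTACGT",
    "modern_2": "ACGTACGAACGTACGT",
    "modern_3": "ACGTACGTACGTACCT",
}
neanderthal_toy = "ACTTACGAAGGTTCGA"

# Largest difference between any two modern sequences.
max_within = max(count_differences(a, b)
                 for a, b in combinations(modern.values(), 2))
# Smallest difference between the toy Neanderthal and any modern sequence.
min_between = min(count_differences(neanderthal_toy, seq)
                  for seq in modern.values())

print("largest modern-vs-modern difference:", max_within)
print("smallest Neanderthal-vs-modern difference:", min_between)
print("Neanderthal falls outside modern variation:", min_between > max_within)
```

Actual analyses work with much longer sequences and must also account for DNA damage and contamination, but the comparison of within-group and between-group differences follows this logic.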

Neanderthal research
Figure 6: Researchers extracted DNA from Neanderthal bones, carefully ensuring it was not contaminated with DNA from any other source (like modern humans). image © NHGRI

As discussed earlier, mtDNA reveals only one dimension of a population’s history, the maternal side. However, in the late 1990s, retrieving nuclear DNA from bones that were more than 50,000 years old was thought to be impossible, and so mtDNA was thought to be all we would ever know of the genetics of our Neanderthal cousins.

This changed with the advent of next-generation in-solution DNA sequencing, often abbreviated "next-gen sequencing," which became available just a few years after the mtDNA of Neanderthals was published. With this far more sensitive technique, Pääbo and others were able to amplify and analyze the Y-chromosome haplotypes of Neanderthals and another extinct human subspecies, called the Denisovans. Similar to the Jewish story, the Y-chromosome and mtDNA family trees disagree in comparisons of modern humans with Neanderthals. In contrast to mtDNA, the Y-chromosomal data show instances of gene flow, that is, interbreeding between modern humans and Neanderthals and other extinct human species.

We have now sequenced a complete Neanderthal genome, completing the picture of their genetic ancestry and its relation to our own. It turns out that, while the modern human lineage diverged from the Neanderthal lineage about 500,000 years ago, and remained separate for most of that time, the two populations exchanged genes on at least four occasions. Today, about 8 percent of the non-African human gene pool consists of Neanderthal DNA! Africans, however, do not have any detectable Neanderthal DNA in their genomes, indicating that the admixture took place only after humans had migrated out of Africa and encountered Neanderthals on the Eurasian continent.

As with the Kohanim, the difference between mtDNA and nuclear DNA can be explained with scenarios in which the men migrated and interbred with women from other populations. Many modern humans have some Neanderthal or Denisovan forefathers, but not foremothers. Our mitochondrial line is more "pure," for lack of a better term, compared with the patrilineal lines, and that is why scientists in the late 1990s incorrectly deduced that modern humans had little or no genetic input from Neanderthals.

Comprehension Checkpoint

Analysis of _____ shows that modern humans and Neanderthals interbred at some point.

Back to the ancestry math problem

At the beginning of this module, we explained how, if you double the number of ancestors you have in each preceding generation, you only have to go back a few hundred years before, theoretically, the total number of ancestors you have is larger than the population of the earth. This is obviously not possible. The math gets even stranger the further back you go. What is wrong with these calculations?

The answer is that, as you go back in your family tree, ancestors will start to pop up more than once. If you go back two generations to your grandparents, you have four different ancestors. If you go back two more, you almost certainly have sixteen distinct ancestors. However, if you go back eight generations or about 200 years, the math would predict that you have 256 ancestors, but you almost certainly don’t. In fact, you may have closer to 200. This is because the farther back you go, the more "repeat" ancestors you will find. In fact, after 12-15 generations, the total number of your ancestors barely increases each additional generation that you go back.
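The "repeat ancestor" effect can be made concrete with a toy family tree. In the Python sketch below, the pedigree is entirely hypothetical: the person labeled "you" has parents who are first cousins, so one pair of great-grandparents (A and B) fills two slots each. The code compares the number of slots in each generation with the number of distinct people who fill them.

```python
# Hypothetical pedigree: each person maps to (father, mother).
# "pgf" and "mgm" are siblings, which makes "dad" and "mom" first cousins.
pedigree = {
    "you": ("dad", "mom"),
    "dad": ("pgf", "pgm"),
    "mom": ("mgf", "mgm"),
    "pgf": ("A", "B"),
    "pgm": ("C", "D"),
    "mgf": ("E", "F"),
    "mgm": ("A", "B"),
}


def ancestors_by_generation(tree, person, generations):
    """Return a list of (slots, distinct_people) for each generation back."""
    current = [person]
    results = []
    for _ in range(generations):
        parents = []
        for individual in current:
            # People whose parents are not recorded contribute no further slots.
            parents.extend(tree.get(individual, ()))
        results.append((len(parents), len(set(parents))))
        current = parents
    return results


for gen, (slots, distinct) in enumerate(ancestors_by_generation(pedigree, "you", 3), 1):
    print(f"generation {gen}: {slots} slots, {distinct} distinct ancestors")
```

Three generations back, this tree has eight slots but only six different people in them. In real family trees the overlap usually starts further back, but it grows quickly once it begins.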

This is a phenomenon known as “pedigree collapse,” in which distant relatives mate with each other, creating a family tree in which an ancestor will appear in multiple places. In the royal families of Europe, the pedigrees collapse in the very recent past. For example, both Queen Elizabeth II and her consort, Prince Philip Mountbatten, are great-great-grandchildren of Queen Victoria (Figure 7). In fact, if we want to trace Queen Elizabeth’s line all the way back to William the Conqueror, there are dozens of paths to choose from. But the rest of us can hardly poke fun. Our own family trees collapse quickly enough.

Pedigree collapse
Figure 7: This family tree shows how both Queen Elizabeth II and her husband, Prince Philip Mountbatten, are great-great-grandchildren of Queen Victoria.

The sum of all this is that we are all related and probably a lot more closely than you think. In fact, scientists estimate that the most recent common ancestor of all humans lived just 2,000-3,000 years ago somewhere in the Levant or Northern Africa. This is different from Y-chromosome Adam or mitochondrial Eve, who gave rise to very specific lineages that then came to comprise the entire population. The most recent common ancestor (MRCA) is just someone who is in everyone’s family tree at least once, even though we can never know what, if any, genetic contribution is his because the DNA we inherited from our two parents becomes shuffled when we pass it down to the next generation.

Nevertheless, every human alive today, from the Buba priests in Zimbabwe to native tribes on remote islands in the Pacific, shares this common ancestor. Of course, many other people living at that time left many, even millions of, descendants, but only one of them, the MRCA, is common to all of us.

So if anyone ever asks you, "Are you related to her?" there is only one correct answer: "yes."


Using genetic markers passed down through the male or female line, scientists can construct family trees going back thousands of years. This module introduces haplotypes – genetic sequences that we inherit from only one parent. As an example, the module looks at the degree of interbreeding between now-extinct Neanderthals and modern humans as determined through an analysis of Y-chromosome haplotypes (male lineage) and mitochondrial DNA haplotypes (female lineage).

Key Concepts

  • Haplotypes are genetic sequences that we inherit from only one parent. There are two types: Y-chromosomes, inherited from your father, and mitochondrial DNA, inherited from your mother.

  • Y-chromosome haplotypes are subject to random mutation and the discovery of numerous different haplotypes led scientists to construct the Y-chromosome family tree.

  • Mitochondria are organelles in all eukaryotic cells and have some of their own DNA. All of your mitochondria come from your mother and help to build a mtDNA family tree.

  • The most recent common ancestor (MRCA) is someone who exists in everyone's family tree.

  • When scientists study populations of a given ethnic group, they generate one family tree based on the population's Y-chromosomes and another one based on its mtDNA.

  • When constructing family trees, often an ancestor will appear in multiple places - this is a phenomenon known as pedigree collapse.

David Warmflash, MD, Nathan H Lents, Ph.D. “Y-Chromosome and Mitochondrial DNA Haplotypes” Visionlearning Vol. BIO-5 (2), 2017.