About us
U5
Description of mtDNA Haplogroup U5
Please click on "mtDNA Results" to see all members of the U5 project organized into known daughter groups or "subclades" of U5. In some cases, subclades can be predicted based on HVR1 and HVR2 results, but in many cases the Full Mitochondrial Genome Sequence (FMS or FGS) test is needed to identify subclades of U5. As of July 8, 2015, including the FTDNA U5 project and GenBank, we have the following totals by subclade:
Subtotals updated June 6, 2015
U5a1: 777
U5a2: 360
U5b1: 497
U5b2: 437
U5b3: 91
The total number of U5 FMS samples is 2162 including two FMS test results that appear to be U5a*, that is, they are not in U5a1 or U5a2 and therefore appear to be new branches that descend from U5a. There have been 314 new samples in the last 11 months. As of July 18, 2014 there were 1828 FMS samples.
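As a quick arithmetic check, the June 6, 2015 subtotals listed above can be summed and expressed as shares of the total. The following is a minimal sketch; the names `subtotals_2015`, `total_samples` and `shares` are chosen here for illustration and are not part of the project's tooling.

```python
# Subclade subtotals copied from the June 6, 2015 list above.
subtotals_2015 = {
    "U5a1": 777,
    "U5a2": 360,
    "U5b1": 497,
    "U5b2": 437,
    "U5b3": 91,
}


def total_samples(subtotals):
    """Return the sum of all subclade counts."""
    return sum(subtotals.values())


def shares(subtotals):
    """Return each subclade's share of the total, in percent (one decimal)."""
    total = total_samples(subtotals)
    return {name: round(100 * count / total, 1) for name, count in subtotals.items()}


print(total_samples(subtotals_2015))  # 2162
for name, pct in shares(subtotals_2015).items():
    print(name, pct)
```

The five subtotals add up to the 2162 FMS samples quoted in the text.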
The Subclades of U5
Text copyleft by Gail Tonnesen, July 15, 2012, revised July 18, 2014; June 6, 2015. Text may be freely quoted and modified with the condition that the same permission is applied to any derivative work. Please note that there is large uncertainty in age estimates and origins of mtDNA haplogroups. Some of the possible origins discussed below are speculative and will be revised as new data becomes available. This analysis is based on test results in GenBank and in the U5a and U5b Full Sequence projects in Dec, 2013.
Haplogroup U is estimated to have originated in the Near East or Southwest Asia around 50,000 years ago, about 15,000 years after modern humans expanded out of Africa. Haplogroup U appears to have lived during a period of rapid population growth and expansion because it has nine major surviving daughter groups, U1 through U9, which are now found among people who have ancestral origins throughout Europe, Asia, and Africa.
Haplogroup U5 is estimated to be about 30,000 years old, and it is primarily found today in people with European ancestry. Both the current geographic distribution of U5 and testing of ancient human remains indicate that the ancestor of U5 expanded into Europe before 31,000 years ago. A 2013 study by Fu et al. found two U5 individuals at the Dolni Vestonice burial site in the Czech Republic, which has been dated to 31,155 years ago. A third person from the same burial was identified as haplogroup U8. The Dolni Vestonice samples have only two of the five mutations (C16192T and C16270T) that are found in the present day U5 population. This indicates that the U5-(C16192T and C16270T) mtDNA sequence is ancestral to the present day U5 population that includes the additional three mutations T3197C, G9477A and T13617C.
Because there are five additional mutations (T3197C, G9477A, T13617C, C16192T and C16270T) that distinguish present day U5 from U, we can conclude that U5 experienced a long period of very slow population growth or a population bottleneck in Europe. The earliest branching of U5 is its two subclades U5a and U5b that have been dated to about 27,000 years ago by Soares et al., while Behar et al. have a younger estimate of about 22,000 years. U5a is defined by two additional mutations A14793G and C16256T, while U5b is defined by three additional mutations C150T, A7768G and T14182C.
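The defining mutations named above can be read as a simple set-membership rule. The sketch below is only an illustration of that rule under simplifying assumptions: it treats a sample as a set of mutation names written exactly as in the text, and it ignores back mutations, heteroplasmy and differences between reference sequences, all of which real haplogroup assignment must handle. The function name `classify` and its return labels are hypothetical.

```python
# Defining mutations as listed in the text above.
U5_MUTATIONS = {"T3197C", "G9477A", "T13617C", "C16192T", "C16270T"}
U5A_MUTATIONS = {"A14793G", "C16256T"}
U5B_MUTATIONS = {"C150T", "A7768G", "T14182C"}


def classify(sample_mutations):
    """Return 'U5a', 'U5b', 'U5', or None for a set of mutation names.

    Simplified rule: a sample is present-day U5 only if it carries all five
    U5 mutations, and it is U5a or U5b only if it also carries every
    defining mutation of that branch.
    """
    found = set(sample_mutations)
    if not U5_MUTATIONS <= found:
        return None  # lacks at least one of the five U5 mutations
    if U5A_MUTATIONS <= found:
        return "U5a"
    if U5B_MUTATIONS <= found:
        return "U5b"
    return "U5"


print(classify(U5_MUTATIONS | U5A_MUTATIONS))  # U5a
print(classify(U5_MUTATIONS | U5B_MUTATIONS))  # U5b
print(classify({"C16192T", "C16270T"}))        # None
```

The last call mirrors the Dolni Vestonice samples, which carry only two of the five mutations and so fall outside present-day U5 under this rule.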
Beginning about 27,000 years ago, the Last Glacial Maximum (LGM) forced U5a and U5b into ice age refugia in southern Europe and perhaps Ukraine and the Near East. U5a has only two known subclades, U5a1 and U5a2, both estimated to be about 20,000 years old. U5b has only three known subclades, U5b1, U5b2 and U5b3, also estimated to be about 20,000 years old. However, age estimates for these subclades from Behar and from Soares vary over a range of 16,000 to 24,000 years. While there is uncertainty in the age estimates of these subclades, it seems likely that a population decline during the LGM is the cause of the lack of ancient diversity or branching in haplogroup U5. It also seems likely that U5a1, U5a2, U5b1, U5b2 and U5b3 were each present in ice age refugia in southern Europe.
As the ice began to retreat about 19,000 years ago, haplogroup U5 was among the first people to repopulate central and northern Europe. We know this because U5 is the dominant haplogroup in ancient remains of early hunter-gatherer populations in Europe, with U5 and its sister group U4 representing about 90% of the earliest Mesolithic hunter-gatherers. The 2013 Fu et al. study found haplogroup U5 in both pre-ice age Paleolithic remains and post-ice age Mesolithic remains, and they conclude: "Because the majority of late Paleolithic and Mesolithic mtDNAs analyzed to date fall on one of the branches of U5, our data provide some support for maternal genetic continuity between the pre- and post-ice age European hunter-gatherers from the time of first settlement to the onset of the Neolithic."
Also beginning around 15,000 years ago we begin to see increasing expansion and diversity in the daughters of U5a1, U5a2, U5b1, U5b2 and U5b3. Each of these has eight or more surviving subclades, and this increase in diversity is consistent with a growing population as U5 expanded from ice age refugia into central and northern Europe. However, U5 was largely replaced by early farmers and other Neolithic immigrants to Europe, and currently U5 represents only about 9% of European mtDNA. Some of the very old subclades of U5 are extremely rare today, perhaps because they represent the remnants of hunter-gatherers who were mostly replaced by Neolithic immigrants.
On the other hand, some U5 subclades are much more common in present populations than others. While we know that U5 was the dominant mtDNA group among early Mesolithic Europeans, it is possible that some U5 subclades might also have been present in early farming or herding populations in the Near East and West Asia, so the present day population of U5 could include a mix of early hunter-gatherers and more recent U5 Neolithic farmer/herder immigrants. Alternatively, certain U5 subclades in southeastern Europe could have adopted farming or been incorporated into farming and herding communities at an early date, perhaps at the beginning of the Neolithic when farmers from the Near East began their expansion into Europe. If certain U5 subclades adopted farming and animal husbandry at an earlier date, their population size could have expanded more rapidly and this could explain their larger distribution today. Testing of ancient remains also shows that U5 was present in the Pontic-Caspian Steppe region, which may have been the homeland of Indo-European speakers (for example, see the Kurgan hypothesis). It seems likely that certain subclades of U5 expanded from the Steppe into both Europe and south Asia during the Bronze Age migrations that brought Indo-European languages to these regions. One of the challenges, and the goal of this project, is to discover the age and specific migration history of each individual subclade of U5.
Age estimates are shown in parentheses after subclade names below as years before present (ybp) and are from Behar et al., 2012 (A "Copernican" Reassessment of the Human Mitochondrial DNA Tree from its Root). In some cases age estimates based on the U5 project data differ significantly from Behar et al., and in those cases the U5 project estimated dates are presented below. Uncertainty ranges are not shown, but in most cases there is an uncertainty range of several thousand years for the older subclades, usually with less uncertainty for younger subclades. An older estimate for the age of the human-chimp common ancestor, published in Science in Aug 2012, could result in somewhat older ages than those estimated below.
In subclade names the "star" symbol is used to indicate test results that have extra mutations for their subclade but are not part of an already named daughter group of that subclade. For example, U5b2* indicates a test result that does not fit into any of the three named subclades U5b2a, U5b2b or U5b2c. Subclade names are from www.phylotree.org. In cases where new subclades have been identified in the U5 project, the * is used in the proposed new subclade name to indicate that the name is tentative, and that the official name could change with the next update of Phylotree.
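The naming convention above can be sketched as a small labeling rule. This is an illustration only: the marker names (`marker_a`, `marker_b`, `marker_c`, `marker_x`) are placeholders invented for the example, not the real defining mutations of U5b2a, U5b2b or U5b2c, and the function `label_sample` is hypothetical.

```python
def label_sample(parent, extra_mutations, named_daughters):
    """Label a sample that already belongs to `parent`.

    extra_mutations: mutations the sample has beyond those defining `parent`.
    named_daughters: dict mapping daughter name -> set of defining mutations.
    Returns a daughter name, `parent + "*"`, or `parent` itself.
    """
    extras = set(extra_mutations)
    for name, defining in named_daughters.items():
        if defining <= extras:
            return name  # fits an already named daughter group
    if extras:
        return parent + "*"  # extra mutations, but no named daughter group
    return parent  # no extra mutations at all


# Placeholder defining markers, invented for illustration only.
daughters = {
    "U5b2a": {"marker_a"},
    "U5b2b": {"marker_b"},
    "U5b2c": {"marker_c"},
}

print(label_sample("U5b2", {"marker_x"}, daughters))              # U5b2*
print(label_sample("U5b2", {"marker_b", "marker_x"}, daughters))  # U5b2b
print(label_sample("U5b2", set(), daughters))                     # U5b2
```

The third case corresponds to what the subclade descriptions call a sample "with no extra mutations", which is distinct from a starred result.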
Geographic origins of subclades are identified below based on the present day geographic distribution of data in the U5 project and GenBank. However, it should be noted that the sample size among FTDNA customers is largest from northwestern Europe, especially Scandinavia, the UK, Ireland, and central Europe, including Germany and Poland. There are also a large number of FMS samples from eastern Europe and Russia from the Malyarchuk et al. 2010 study. Sample sizes are probably somewhat smaller in the Mediterranean countries, and much smaller in southwest Asia and India. Estimates of geographic origins may change and new U5 subclades may be discovered when we receive more test results from undersampled regions in the Near East and Asia. It should also be noted that the place of greatest frequency in the present day may not indicate the place of origin of a subclade. Rather, the place with the greatest diversity in test results is a better indicator of the place of origin. For example, U5b1 occurs at very high frequency in Finland, but it has very low diversity there, while U5b1 has low frequency and very high diversity in Spain. This indicates that U5b1 originated in Spain, not Finland. The most reliable indicator of the place of origin will be testing of mtDNA in ancient human remains.
Each of the five branches of U5 (including U5a1, U5a2, U5b1, U5b2 and U5b3) is described in more detail below. Please feel free to email me (gtonnesen at gmail.com) with any comments/suggestions/corrections.
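The distinction between frequency and diversity can be made concrete with a small tally. The sketch below uses invented observations for two unnamed regions (`region_A`, `region_B`); the data, the lineage labels and the function `summarize` are all hypothetical, and counting distinct lineages is only one simple stand-in for diversity.

```python
from collections import defaultdict

# Hypothetical (region, lineage) observations, invented purely for illustration.
samples = [("region_A", "lineage_1")] * 6 + [
    ("region_B", "lineage_1"),
    ("region_B", "lineage_2"),
    ("region_B", "lineage_3"),
]


def summarize(observations):
    """Return {region: (sample_count, distinct_lineage_count)}."""
    counts = defaultdict(int)
    lineages = defaultdict(set)
    for region, lineage in observations:
        counts[region] += 1
        lineages[region].add(lineage)
    return {region: (counts[region], len(lineages[region])) for region in counts}


summary = summarize(samples)
most_frequent = max(summary, key=lambda region: summary[region][0])
most_diverse = max(summary, key=lambda region: summary[region][1])

print(summary)
print("highest frequency:", most_frequent)
print("highest diversity:", most_diverse)
```

In this toy data the region with the most samples has a single lineage, while the region with fewer samples has three, which is the pattern the text describes for U5b1 in Finland and Spain.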
Haplogroup U5a1
(last full U5a1 update July 2013; U5a1a1 updated S)
U5a1 has 817 FMS test results with 8 named subclades: U5a1a to U5a1h. It also has 6 currently unnamed U5a1* lineages that are each represented by only one or two samples including 4 from Germany and 2 from Italy. A newly proposed “U5a1* Group I” is also described below.
U5a1a is defined by an extra mutation at marker 1700, and Behar et al. estimate its age as about 12,000 ybp. There are 454 FMS test results in U5a1a1 (7,500 ybp) and 118 FMS test results in U5a1a2 (10,300 ybp). There are no known test results that are U5a1a*, and this might indicate that U5a1a lived in a community with slow population growth, while its two subclades lived in communities that had begun to grow very rapidly.
U5a1a1 is defined by an extra mutation at marker 1700 and is estimated to be about 7500 years old. U5a1a1 is very large and diverse with nine named subclades (U5a1a1a to U5a1a1i), about 120 samples that are U5a1a1*, and 20 samples that are U5a1a1 with no extra mutations. Two samples of U5a1a1 have been found in ancient remains dated at about 5100 years from the Yamnaya culture in Samara, Russia. U5a1a1 and its subclades are found throughout Europe. U5a1a1* has the largest number of samples in the UK, with a smaller number found in Germany, Poland and Scandinavia, a still smaller number found in eastern Europe and 2 samples in Turkey. There are 27 samples of U5a1a1 that also have a mutation at 152, including subclades U5a1a1a and U5a1a1b. These samples are found most frequently in eastern Europe and Russia.
U5a1a1c [updated Mar 2023] is defined by mutations at markers 6905 and 13015, has 36 samples and an age estimate of about 2500 years. U5a1a1c is mostly found in northern Europe with about half the samples from Scandinavia and also from Scotland, England, Germany, Slovenia and Russia.
U5a1a1d [updated Mar 2023] is defined by mutations at markers has an age estimate of about 4000 years. There is an ancient sample from the Western Pontic-Caspian region carbon dated at 3940 years. U5a1a1d is generally northern European, found mostly in the British Isles, Scandinavia (including Iceland) and Germany, with a smaller number of samples from Russia, Poland and the Czech Republic.
U5a1a1e [updated Mar 2023] is defined by mutations at markers has an age estimate of about xx years and is .
U5a1a1*Group F [updated Mar 2023] is defined by a mutation at marker 8705. It has 22 samples and its age estimate is very close to that of U5a1a1, about 7500 years. Group F1 is about 3000 years old and is defined by a back mutation at 16270 and a mutation at 16172, and it includes 15 samples from Germany, Bohemia, Hungary, Denmark, Wales, England and Ireland. Group F2 includes 3 samples from Italy and Iran. Group F3 includes samples from Sardinia and Belarus.
U5a1a1g [updated Mar 2023] is defined by a mutation at marker 4424 and has 18 samples, including Group 1 with six samples from Norway, and other samples from England, Scotland and Lebanon. Its age estimate is very close to that of U5a1a1, about 7500 years.
U5a1a1h [updated Mar 2023] is defined by a mutation at marker 16294 and has an age estimate of about 4000 years. There are 13 samples from northern and eastern Europe: Bulgaria, Poland, Slovakia, Russia, Norway, Denmark and Finland.
U5a1a2 is estimated by Behar et al. to be about 10,300 ybp and it has 2 major subclades. U5a1a2a (3,000 ybp) has 23 samples found mostly in eastern Europe and Russia, but also in Turkey, Iran, Armenia, India and Buryatia (in southeast Siberia). U5a1a2b has 12 samples but only 4 with known ancestry, 3 from the UK and 1 from Sweden. There is also one U5a1a2* sample from Poland.
Given that U5a1a1 has been found in ancient remains from the Yamnaya culture, it seems likely that U5a1a1 originated in that culture in the eastern European Steppe region and that U5a1a1 was part of the migration that brought Indo-European languages to Europe. U5a1a1 expanded very rapidly in diversity beginning about 6000 years ago, and this is also consistent with its expansion in a farming and/or herding community that grew rapidly after adopting Neolithic technologies. There are no ancient remains of U5a1a2 available but its age, distribution and diversity might also be consistent with an origin in the eastern European Steppe region.
U5a1b has 133 FMS test results and an estimated age of about 9000 years. There are 24 people in U5a1b*, and 17 of these represent distinct lineages that do not yet have subclade names, while two of the samples are U5a1b with no extra mutations. U5a1b also has four named subclades (U5a1b1 to U5a1b4). The very large diversity in U5a1b indicates that it lived in a population that was growing rapidly. The U5a1b* samples are mostly found in northern Europe, with 5 in the UK or Ireland, 4 in Germany, 2 in Russia, 1 in Italy, 1 in Slovakia, and the others of unknown ancestry. The Russian and Slovak samples are part of a proposed new "Group 5" subclade.
U5a1b1 is the largest group within U5a1b with 83 members and an age estimate of about 8000 years. U5a1b1* has 29 members, with 10 from the UK, 3 from Poland, 2 from Germany, 2 from Finland and 1 each from Ireland, France, Austria, Switzerland, Italy, Norway, and Russia. U5a1b1 also has six named subclades (U5a1b1a to U5a1b1f). U5a1b1a has 17 members, with 4 from Ireland, 2 from the UK, 2 from Norway, and 1 each from Germany and Italy. U5a1b1b has 7 members with 4 from Russia, 1 from Belarus and 1 from Sweden. U5a1b1c has 12 members, including two U5a1b1c* from Russia and England; seven U5a1b1c1 with 3 from Finland and 1 each from Russia, Serbia, Poland and Denmark; and three U5a1b1c2 with 2 from Russia. U5a1b1d has 8 samples including 2 from the UK, 2 from Poland, and 1 each from Germany and Switzerland. U5a1b1 has its greatest frequency and diversity in Northern Europe from Ireland to Russia, and 54% of all U5a1b samples fall within U5a1b1.
U5a1b1e [updated Mar 2023] is defined by a mutation at marker 12582 and has 33 members and an age estimate of about 7000 years. Group 1 is defined by an extra mutation at marker 6260 and has 20 members from England, Orcadia and Norway, also including Group 1a with members from the Netherlands and Denmark, and Group 1a2 with members from England and Ireland. U5a1b1e Group 2 is defined by a mutation at 16294 and includes 11 members from France, Iceland, Germany and Poland.
U5a1b1f [updated Mar 2023] is defined by mutations at markers 15596 and 16129. It has only 5 members, including four with ancestry in England who share another extra mutation at 16111, and one sample from India. The age estimate is about 5700 years, although the uncertainty is high with only five samples.
U5a1b1g [updated Mar 2023] is defined by a mutation at marker 11353 and has 52 members and an age estimate of about 2800 years. Members in U5a1b1g* with known ancestry are from England, Austria, Switzerland and Russia. The subgroup U5a1b1g*1 is defined by a mutation at marker 709 and includes 11 people with ancestry from Scotland, England and Denmark.
U5a1b1h [updated Mar 2023] is defined by a mutation at marker 10754 and has 27 members and an age estimate of about 2200 years. Most members are from Finland and Sweden with just a few samples from Germany, Denmark and Estonia.
U5a1b2 has only 3 members with 2 Czechs and 1 from Finland.
U5a1b3 has 19 members, including 8 in "Group A" with 5 from Finland and 1 from England; 2 from Scotland, 2 from Ireland and 1 each from Germany, Ukraine and Russia.
U5a1b4 has 4 members including 1 from England and 1 from Ireland.
To summarize, U5a1b is found most often in northern Europe including the UK, Ireland, Scandinavia, Germany, Poland and Russia, with a smaller number of test results in other parts of Europe. It is very diverse with 4 named subclades and 17 unnamed for a total of 21 unique lineages. More than half of U5a1b are in a single subclade U5a1b1 with each of the remaining 20 groups having a small number of members or a single member. The majority of the U5a1b* samples are from northwestern Europe but that could be a result of greater sampling from that region. The geographic distribution of U5a1b1 seems similar to other members of U5a1b. This raises the interesting question: Why is U5a1b1 so much larger than the other 20 U5a1b lineages?
U5a1c has 46 FMS test results and an estimated age of about 15,000 years. U5a1c1* (5,000 ybp) has 8 samples including 3 from Russia and one each from Poland, the Czech Republic, Slovakia and Sweden. U5a1c1a has 9 samples from France, England, Scotland and Hungary. U5a1c2* (13,000 ybp) has a single sample from Russia, and U5a1c2a (6,000 ybp) has 23 samples, mostly from Denmark but also from Sweden, Ireland, Estonia and Poland. There are also four U5a1c* samples including U5a1c*3 from Germany & Italy and U5a1c*4 from France & Germany.
U5a1d has 42 test results and an estimated age of about 15,000 years. There is one U5a1d* from France and one from ancient remains at Samara, Russia (dated at 7600 ybp). There are seven U5a1d1* (8,000 ybp) samples from Ireland, Poland, France, Germany, Ukraine and Russia. There are 17 U5a1d2a* (5,000 ybp) test results from Norway, Denmark, Sweden, Germany, Ireland, France and Spain, and 6 U5a1d2a1 (3,000 ybp) with one each from Sweden, Finland, Russia, Belarus and Buryatia. There are 10 U5a1d2b (7,000 ybp) test results including two from the Altai region, two from Norway, and one each Tatar, Ukraine, Russia, Hungary, Finland, and Iran. U5a1d2b is especially interesting because it has a characteristic mutation at 16304 and may have been found in several ancient remains from northeastern Europe to central Asia. 16304 is a frequent mutation site, so coding region test results are needed to confirm if the ancient remains are in fact U5a1d2b, but the distribution of present day U5a1d2b samples is similar to that of the ancient remains, so it seems likely they are U5a1d2b. (updated 02/23/15)
U5a1e includes only 4 test results with ancestry in Germany, Poland and Russia.
U5a1f has 10 test results and is estimated to be about 13,000 years old. There is one person in U5a1f1* from Switzerland; two people in U5a1f1a* from Hungary and an Adygei from the Caucasus Mountains; and four people in U5a1f1a1 from Germany, England and Norway. There are two unnamed groups in U5a1f*, one of which includes two people from Russia, and the other includes one person from Georgia and one from northern Europe. Perhaps U5a1f had its origins in an ice age refuge in the Ukraine.
U5a1g has 8 test results and is estimated to be about 9,000 years old. It has been found mostly in southeastern Europe with 2 people from Slovakia and one each from Italy, Macedonia, Armenia and England.
U5a1h [updated October 2023]
Subtotals updated March 2023
U5a1: 1,559
U5a2: 827
U5b1: 1,107
U5b2: 807
U5b3: 135
Note: These totals do not yet include all of the people who joined the project in the last few years
The total number of U5 FMS samples is 2162 including two FMS test results that appear to be U5a*, that is, they are not in U5a1 or U5a2 and therefore appear to be new branches that descend from U5a. There have been 314 new samples in the last 11 months. As of July 18, 2014 there were 1828 DMS samples.
The Subclades of U5
Text copyleft by Gail Tonnesen, July 15, 2012, revised July 18, 2014; June 6, 2015. Text may be freely quoted and modified with the condition that the same permission is applied to any derivative work. Please note that there is large uncertainty in age estimates and origins of mtDNA haplogroups. Some of the possible origins discussed below are speculative and will be revised as new data becomes available. This analysis is based on test results in GenBank and in the U5a and U5b Full Sequence projects in Dec, 2013.
Haplogroup U is estimated to have originated in the Near East or Southwest Asia around 50,000 years ago, about 15,000 years after modern humans expanded out of Africa. Haplogroup U appears to have lived during a period of rapid population growth and expansion because it has nine major surviving daughter groups, U1 through U9, which are now found among people who have ancestral origins throughout Europe, Asia, and Africa.
Haplogroup U5 is estimated to be about 30,000 years old, and it is primarily found today in people with European ancestry. Both the current geographic distribution of U5 and testing of ancient human remains indicate that the ancestor of U5 expanded into Europe before 31,000 years ago. A 2013 study by Fu et al. found two U5 individuals at the Dolni Vestonice burial site in the Czech Republic that has been dated to 31,155 years ago. A third person from the same burial was identified as haplogroup U8. The Dolni Vestonice samples have only two of the five mutations ( C16192T and C16270T) that are found in the present day U5 population. This indicates that the U5-(C16192T and C16270T) mtDNA sequence is ancestral to the present day U5 population that includes the additional three mutations T3197C, G9477A and T13617C.
Because there are five additional mutations (T3197C, G9477A, T13617C, C16192T and C16270T) that distinguish present day U5 from U, we can conclude that U5 experienced a long period of very slow population growth or a population bottleneck in Europe. The earliest branching of U5 is its two subclades U5a and U5b that have been dated to about 27,000 years ago by Soares et al., while Behar et al. have a younger estimate of about 22,000 years. U5a is defined by two additional mutations A14793G and C16256T, while U5b is defined by three additional mutations C150T, A7768G and T14182C.
Beginning about 27,000 years ago, the Last Glacial Maximum (LGM) forced U5a and U5b into ice age refugia in southern Europe and perhaps Ukraine and the Near East. U5a has only two known subclades, U5a1 and U5a2, both estimated to be about 20,000 years old. U5b has only three known subclades, U5b1, U5b2 and U5b3, also estimated to be about 20,000 years old. However, age estimates for these subclades from Behar and from Soares vary over a range of 16,000 to 24,000 years. While there is uncertainty in the age estimates of these subclades, it seems likely that a population decline during the LGM is the cause of the lack of ancient diversity or branching in haplogroup U5. It also seems likely that U5a1, U5a2, U5b1, U5b2 and U5b3 were each present in ice age refugia in southern Europe.
As the ice began to retreat about 19,000 years ago, haplogroup U5 was among the first people to repopulate central and northern Europe. We know this because U5 is the dominant haplogroup in ancient remains of early hunter-gatherer populations in Europe, with U5 and its sister group U4 representing about 90% of the earliest Mesolithic hunter-gatherers. The 2013 Fu et al. study found haplogroup U5 in both pre-ice age Paleolithic remains and post-ice age Mesolithic remains, and they conclude: "Because the majority of late Paleolithic and Mesolithic mtDNAs analyzed to date fall on one of the branches of U5, our data provide some support for maternal genetic continuity between the pre- and post-ice age European hunter-gatherers from the time of first settlement to the onset of the Neolithic."
Also beginning around 15,000 years ago we begin to see increasing expansion and diversity in the daughters of U5a1, U5a2, U5b1, U5b2 and U5b3. Each of these has eight or more surviving subclades, and this increase in diversity is consistent with a growing population as U5 expanded from ice age refugia into central and northern Europe. However, U5 was largely replaced by early farmers and other Neolithic immigrants to Europe, and currently U5 represents only about 9% of European mtDNA. Some of the very old subclades of U5 are extremely rare today, perhaps because they represent the remnants of hunter-gatherers who were mostly replaced by Neolithic immigrants.
On the other hand, some U5 subclades are much more common in present populations than others. While we know that U5 was the dominant mtDNA group among early Mesolithic Europeans, it is possible that some U5 subclades might also have been present in early farming or herding populations in the Near East and West Asia, so the present day population of U5 could include a mix of early hunter-gatherers and more recent U5 Neolithic farmer/herder immigrants. Alternatively, certain U5 subclades in southeastern Europe could have adopted farming or been incorporated into farming and herding communities at an early date, perhaps at the beginning of the Neolithic when farmers from the Near East began their expansion into Europe. If certain U5 subclades adopted farming and animal husbandry at an earlier date, their population size could have expanded more rapidly and this could explain their larger distribution today. Testing of ancient remains also shows that U5 was present in the Pontic-Caspian Steppe region, which may have been the home land of Indo-European speakers (for example, see the Kurgan hypothesis). It seems likely that certain subclades of U5 expanded from the Steppe into both Europe and south Asia during the Bronze age migrations that brought Indo-European languages to these regions. One of the challenges, and the goal of this project, is to discover the age and specific migration history of each individual subclade of U5.
Age estimates are shown below in parentheses after subclade names below as years before present (ybp) and are from Behar et al., 2012, (‘‘A Copernican’’ Reassessment of the Human Mitochondrial DNA Tree from its Root). In some cases age estimates based on the U5 project data differ significantly from Behar et al, and in those cases the U5 project estimated dates are presented below. Uncertainty ranges are not shown, but in most cases there is an uncertainty range of several thousand years for the older subclades and usually with less uncertainty for younger subclades. An older estimate for the age of human-chimp common ancestor, published in Science in Aug 2012, could result in somewhat older ages than those estimated below.
In subclade names the "star" symbol is used to indicate test results that have extra mutations for their subclade but are not part of an already named daughter group of that subclade. For example, U5b2* indicates a test result that does not fit into any of the three named subclades U5b2a, U5b2b or U5b2c. Subclade names are from www.phylotree.org. In cases where new subclades have been identified in the U5 project, the * is used in the proposed new subclade name to indicate that the name is tentative, and that the official name could change with the next update of Phylotree.
Geographic origins of subclades are identified below based on the present day geographic distribution of data in the U5 project and GenBank. However, it should be noted that the sample size among FTDNA customers is largest from northwestern Europe, especially Scandinavia, the UK, Ireland, and central Europe, including Germany and Poland. There are also a large number of FMS samples from eastern Europe and Russia from the Malyarchuck et al 2010 study. Sample sizes are probably somewhat smaller in the Mediterranean countries, and much smaller in southwest Asia and India. Estimates of geographic origins may change and new U5 subclades may be discovered when we receive more test results from under sampled regions in the Near East and Asia. It should also be noted that the place of greatest frequency in the present day may not indicate the place of origin of a subclade. Rather, the place with the greatest diversity in test results is a better indicator of the place of origin. For example, U5b1 occurs at very high frequency in Finland, but it has very low diversity there, while U5b1 has low frequency and very high diversity in Spain. This indicates that U5b1 originated in Spain, not Finland. The most reliable indicator of the place of origin will be testing of mtDNA in ancient human remains.
Each of the five branches of U5 (including U5a1, U5a2, U5b1, U5b2 and U5b3) is described in more detail below. Please feel free to email me (gtonnesen at gmail.com with any comments/suggestions/corrections.
Haplogroup U5a1
(last full U5a1 update July 2013; U5a1a1 updated S)
U5a1 has 817 FMS test results with 8 named subclades: U5a1a to U5a1h. It also has 6 currently unnamed U5a1* lineages that are each represented by only one or two samples including 4 from Germany and 2 from Italy. A newly proposed “U5a1* Group I” is also described below.
U5a1a is defined by an extra mutation at marker 1700, and Behar et al. estimate its age as about 12,000 ybp. There are 454 FMS test results in U5a1a1 (7,500 ybp) and 118 FMS test results in U5a1a2 (10,300 ybp). There are no known test results that are U5a1a*, and this might indicate that U5a1a lived in a community with slow population growth, while its two subclades lived in communities that had begun to grow very rapidly.
U5a1a1 is defined by an extra mutation at marker 1700 and is estimated to be about 7500 years old. U5a1a1 is very large and diverse with nine named subclades (U5a1a1a to U5a1a1i) and about 120 samples that are U5a1a1*, and 20 samples that are U5a1a1 with no extra mutations. Two samples of U5a1a1 have been found in ancient remains dated at about 5100 years from the Yamnaya culture in Samara, Russia. U5a1a1 and its subclades are found throughout Europe. U5a1a1* has the largest number of samples in the UK, with a smaller number found in Germany, Poland and Scandinavia, a still smaller number found in eastern Europe and 2 samples in Turkey. There are 27 samples of U5a1a1 that also have a mutation at 152, including subclades U5a1a1a and U5a1a1b. These samples are found most frequently in eastern Europe and Russia.
U5a1a1c [updated Mar 2023] is defined by mutations at markers 6905 and 13015, has 36 samples and an age estimate of about 2500 years. U5a1a1c is mostly found in northern Europe with about half the samples from Scandinavia and the rest from Scotland, England, Germany, Slovenia and Russia.
U5a1a1d [updated Mar 2023] is defined by mutations at markers has an age estimate of about 4000 years. There is an ancient sample from the Western Pontic-Caspian region carbon dated at 3940 years. U5a1a1d is generally northern European, found mostly in the British Isles, Scandinavia (including Iceland) and Germany, with a smaller number of samples from Russia, Poland and the Czech Republic.
U5a1a1e [updated Mar 2023] is defined by mutations at markers has an age estimate of about xx years and is .
U5a1a1* Group F [updated Mar 2023] is defined by a mutation at marker 8705. It has 22 samples and an age estimate very close to that of U5a1a1, about 7500 years. Group F1 is about 3000 years old and is defined by a back mutation at 16270 and a mutation at 16172, and it includes 15 samples from Germany, Bohemia, Hungary, Denmark, Wales, England and Ireland. Group F2 includes 3 samples from Italy and Iran. Group F3 includes samples from Sardinia and Belarus.
U5a1a1g [updated Mar 2023] is defined by a mutation at marker 4424 and has 18 samples, including Group 1 with six samples from Norway, and other samples from England, Scotland and Lebanon. Its age estimate is very close to that of U5a1a1, about 7500 years.
U5a1a1h [updated Mar 2023] is defined by a mutation at marker 16294 and has an age estimate of about 4000 years. There are 13 samples from northern and eastern Europe: Bulgaria, Poland, Slovakia, Russia, Norway, Denmark and Finland.
U5a1a2 is estimated by Behar et al. to be about 10,300 ybp and it has 2 major subclades. U5a1a2a (3,000 ybp) has 23 samples found mostly in eastern Europe and Russia, but also in Turkey, Iran, Armenia, India and Buryatia (in southeast Siberia). U5a1a2b has 12 samples but only 4 with known ancestry, 3 from the UK and 1 from Sweden. There is also one U5a1a2* sample from Poland.
Given that U5a1a1 has been found in ancient remains from the Yamnaya culture, it seems likely that U5a1a1 originated in that culture in the eastern European Steppe region and that U5a1a1 was part of the migration that brought Indo-European languages to Europe. U5a1a1 has expanded very rapidly in diversity beginning about 6000 years ago, and this is also consistent with its expansion in farming and/or herding communities that grew rapidly after adopting Neolithic technologies. There are no ancient remains of U5a1a2 available but its age, distribution and diversity might also be consistent with an origin in the eastern European Steppe region.
U5a1b has 133 FMS test results and an estimated age of about 9000 years. There are 24 people in U5a1b*, and 17 of these represent distinct lineages that do not yet have subclade names, while two of the samples are U5a1b with no extra mutations. U5a1b also has four named subclades (U5a1b1 to U5a1b4). The very large diversity in U5a1b indicates that it lived in a population that was growing rapidly. The U5a1b* samples are mostly found in northern Europe, with 5 in the UK or Ireland, 4 in Germany, 2 in Russia, 1 in Italy, 1 in Slovakia, and the others of unknown ancestry. The Russian and Slovak samples are part of a proposed new "Group 5" subclade.
U5a1b1 is the largest group within U5a1b with 83 members and an age estimate of about 8000 years. U5a1b1* has 29 members, with 10 from the UK, 3 from Poland, 2 from Germany, 2 from Finland and 1 each from Ireland, France, Austria, Switzerland, Italy, Norway, and Russia. U5a1b1 also has six named subclades (U5a1b1a to U5a1b1f). U5a1b1a has 17 members, with 4 from Ireland, 2 from the UK, 2 from Norway, and 1 each from Germany and Italy. U5a1b1b has 7 members with 4 from Russia, 1 from Belarus and 1 from Sweden. U5a1b1c has 12 members, including two U5a1b1c* from Russia and England; seven U5a1b1c1 with 3 from Finland and 1 each from Russia, Serbia, Poland and Denmark; and three U5a1b1c2 with 2 from Russia. U5a1b1d has 8 samples including 2 from the UK, 2 from Poland, and 1 each from Germany and Switzerland. U5a1b1 has its greatest frequency and diversity in Northern Europe from Ireland to Russia, and 54% of all U5a1b samples fall within U5a1b1.
U5a1b1e [updated Mar 2023] is defined by a mutation at marker 12582 and has 33 members and an age estimate of about 7000 years. Group 1 is defined by an extra mutation at marker 6260 and has 20 members from England, Orcadia and Norway, also including Group 1a with members from the Netherlands and Denmark, and Group 1a2 with members from England and Ireland. U5a1b1e Group 2 is defined by a mutation at 16294 and includes 11 members from France, Iceland, Germany and Poland.
U5a1b1f [updated Mar 2023] is defined by mutations at markers 15596 and 16129. It has only 5 members, including four with ancestry in England who share another extra mutation at 16111, and one sample from India. The age estimate is about 5700 years although the uncertainty is high with only five samples.
U5a1b1g [updated Mar 2023] is defined by a mutation at marker 11353 and has 52 members and an age estimate of about 2800 years. Members in U5a1b1g* with known ancestry are from England, Austria, Switzerland and Russia. The subgroup U5a1b1g*1 is defined by a mutation at marker 709 and includes 11 people with ancestry from Scotland, England and Denmark.
U5a1b1h [updated Mar 2023] is defined by a mutation at marker 10754 and has 27 members and an age estimate of about 2200 years. Most members are from Finland and Sweden with just a few samples from Germany, Denmark and Estonia.
U5a1b2 has only 3 members with 2 Czechs and 1 from Finland.
U5a1b3 has 19 members, including 8 in "Group A" with 5 from Finland and 1 from England; 2 from Scotland, 2 from Ireland and 1 each from Germany, Ukraine and Russia.
U5a1b4 has 4 members including 1 from England and 1 from Ireland.
To summarize, U5a1b is found most often in northern Europe including the UK, Ireland, Scandinavia, Germany, Poland and Russia, with a smaller number of test results in other parts of Europe. It is very diverse with 4 named subclades and 17 unnamed for a total of 21 unique lineages. More than half of U5a1b are in a single subclade U5a1b1 with each of the remaining 20 groups having a small number of members or a single member. The majority of the U5a1b* samples are from northwestern Europe but that could be a result of greater sampling from that region. The geographic distribution of U5a1b1 seems similar to other members of U5a1b. This raises the interesting question: Why is U5a1b1 so much larger than the other 20 U5a1b lineages?
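The arithmetic behind this summary can be checked with a short Python sketch. The counts are the ones quoted in the U5a1b paragraphs above; the variable names are illustrative. Counts in this section were compiled at different update dates, so a share computed this way need not match a percentage quoted elsewhere in the text.

```python
# Counts quoted in the text above (FMS test results).
u5a1b_total = 133       # all U5a1b samples
u5a1b1_members = 83     # samples in the largest subclade, U5a1b1
named_subclades = 4     # U5a1b1 to U5a1b4
unnamed_lineages = 17   # distinct U5a1b* lineages without subclade names

total_lineages = named_subclades + unnamed_lineages
share_u5a1b1 = 100 * u5a1b1_members / u5a1b_total

# Rough average size of the other 20 lineages (everything outside U5a1b1).
other_lineages = total_lineages - 1
avg_other_size = (u5a1b_total - u5a1b1_members) / other_lineages

print(f"Lineages in U5a1b: {total_lineages}")
print(f"Share of U5a1b samples in U5a1b1: {share_u5a1b1:.1f}%")
print(f"Average size of the other {other_lineages} lineages: {avg_other_size:.1f}")
```

With these counts U5a1b1 holds more than half of the samples, while the other lineages average only a few members each, which is the contrast behind the question raised above.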
U5a1c has 46 FMS test results and an estimated age of about 15,000 years. U5a1c1* (5,000 ybp) has 8 samples including 3 from Russia and one each from Poland, the Czech Republic, Slovakia and Sweden. U5a1c1a has 9 samples from France, England, Scotland and Hungary. U5a1c2* (13,000 ybp) has a single sample from Russia, and U5a1c2a (6,000 ybp) has 23 samples, mostly from Denmark but also from Sweden, Ireland, Estonia and Poland. There are also four U5a1c* samples including U5a1c*3 from Germany & Italy and U5a1c*4 from France & Germany.
U5a1d has 42 test results and an estimated age of about 15,000 years. There is one U5a1d* from France and one from ancient remains at Samara, Russia (dated at 7600 ybp). There are seven U5a1d1* (8,000 ybp) samples from Ireland, Poland, France, Germany, Ukraine and Russia. There are 17 U5a1d2a* (5,000 ybp) test results from Norway, Denmark, Sweden, Germany, Ireland, France and Spain, and 6 U5a1d2a1 (3,000 ybp) with one each from Sweden, Finland, Russia, Belarus and Buryat. There are 10 U5a1d2b (7,000 ybp) test results including two from the Altai region, two from Norway, and one each Tatar, Ukraine, Russia, Hungary, Finland, and Iran. U5a1d2b is especially interesting because it has a characteristic mutation at 16304 and may have been found in several ancient remains from northeastern Europe to central Asia. 16304 is a frequent mutation site, so coding region test results are needed to confirm if the ancient remains are in fact U5a1d2b, but the distribution of present day U5a1d2b samples is similar to that of the ancient remains, so it seems likely they are U5a1d2b. (updated 02/23/15)
U5a1e includes only 4 test results with ancestry in Germany, Poland and Russia.
U5a1f has 10 test results and is estimated to be about 13,000 years old. There is one person in U5a1f1* from Switzerland; two people in U5a1f1a* from Hungary and an Adygei from the Caucasus Mountains; and four people in U5a1f1a1 from Germany, England and Norway. There are two unnamed groups in U5a1f*, one of which includes two people from Russia, and the other includes one person from Georgia and one from northern Europe. Perhaps U5a1f had its origins in an ice age refuge in the Ukraine.
U5a1g has 8 test results and is estimated to be about 9,000 years old. It has been found mostly in southeastern Europe with 2 people from Slovakia and one each from Italy, Macedonia, Armenia and England.
U5a1h [updated October 2023]
U5a1h is another rare subclade with 53 samples, most with ancestry from Scotland, Ireland, England and the UK, with a few samples from Denmark and Germany. U5a1h is estimated to share a common maternal ancestor about 3,000 years old, but it diverged from other U5a1 samples around 20,000 years ago. It is interesting that U5a1h is defined by a set of 9 mutations and it has no sister groups or branches in its tree since U5a1 some 20,000 years ago. It is likely that the maternal ancestor of U5a1h was among early hunter-gatherers that repopulated northern Europe after the last ice age. It may have been among the earliest people to populate the British Isles and was then largely replaced by later immigrants. It is very unusual to have no branches in the tree for a period of 17,000 years. However, a recent research study reported a sample in the Mansi people of western Siberia that has 5 of the 9 mutations that define U5a1h, so this presents a sister group that diverged from the main U5a1h lineage around 8000 years ago. This seems consistent with the theory of U5a1h originating in northern European hunter-gatherers during the Mesolithic period.
U5a1* Group I is a newly discovered and fairly rare branch of U5a1. We only have 9 test results and Group I is estimated to be about 10,000 years old. Seven of these people share 2 extra mutations and this Group I1 is estimated to be about 5,000 years old. U5a1* Group I is rather unusual in that it has a mix of 6 people with north European ancestry (Norway, Germany, UK) and 2 people with south Asian ancestry. My guess is that Group I was present among early hunter-gatherers who repopulated Europe after the last glacial maximum, about 10,000 years ago. We will need more test results to determine if the south Asian branches of Group I reflect ancient or recent migrations from Europe to south Asia.
Summary of U5a1: While the precise age and geographic origin of U5a1 remains uncertain, we know that U5a1 lived during a time of more rapid population expansion because it has 14 known daughter lineages, including nine named subclades and five lineages not yet named. The greatest diversity of U5a1 seems to be in central and northern Europe (note that the five very rare unnamed U5a1* lineages have been found in Italy, Tyrol, Germany and Poland). The two dominant subclades U5a1a and U5a1b represent 70% of all U5a1 samples, while the other U5a1 subclades are found much less frequently. This suggests that U5a1a and U5a1b might have been present in populations that began to grow rapidly perhaps around the beginning of the Neolithic period. The presence of U5a1a1 in ancient Yamnaya culture remains supports the theory that U5a1a1 originated in the eastern European steppe and arrived in western Europe with the migrations that brought Indo-European languages to western Europe. Other U5a1 subclades might represent remnants of hunter-gatherer populations that adopted Neolithic farming and herding practices at a later date. Some U5a1 samples, including U5a1d2b and "U5a1* Group I", have been found in central Asia (including ancient remains) and India, and these samples probably represent early migrations of U5a1 populations from Europe into central Asia. It is likely that additional very rare subclades of U5a1 still remain to be discovered, and additional testing of present day populations and ancient remains will lead to a more complete description of the history of U5a1 in Europe.
Haplogroup U5a2
(update in progress - Nov 1, 2014)
U5a2 has been estimated to be around 20,000 years old and it has 316 FMS test results. U5a2 has five named subclades (U5a2a to U5a2e). It also has three unnamed subclades, U5a2* Group F with ancestry in France and Moldova, U5a2* Group G with ancestry in Italy, and U5a2* Group H with ancestry in England.
U5a2a is estimated to be about 12,000 years old. It has 117 FMS test results, but 104 of these are in a single subclade U5a2a1 estimated to be about 6,000 years old. The 13 U5a2a* samples represent three different subclades of U5a2a that are found mostly in northern Europe. U5a2a has a distinctive HVR1 signature, and 2 sets of ancient remains have been identified as U5a2a based on HVR1 test results: remains from Hohlenstein-Stadel, Germany dated to 8,700 years ago, and another set of remains from Damsbo, Denmark dated to 4,200 years ago. The Hohlenstein-Stadel sample appears to be a close match to one of the U5a2a* members of the U5 project. U5a2a is interesting because 97% of its samples are in U5a2a1, only one of the four surviving U5a2a lineages. One possible interpretation is that U5a2a1 originated among eastern European hunter-gatherers in the forest steppe region, where it underwent rapid population expansion beginning about 6,000 years ago, and then expanded into western Europe with the migration of Indo-European speakers. It is possible that the remaining hunter-gatherer U5a2a lineages in western Europe were mostly replaced by Neolithic and Bronze Age immigrants (including U5a2a1), and therefore other subclades of U5a2a are found at very low frequency in northern Europe today.
U5a2a1 is a very diverse group with 13 named or proposed subclades. There are an additional 18 U5a2a1* lineages that are represented by a single sample and 9 samples that are U5a2a1 with no additional mutations. U5a2a1* is found throughout northern Europe, with 13 from Finland, 11 from Russia, 5 from Germany, 4 from the UK, 3 from Poland, 2 from Ireland, 3 from Sweden, and one each from France, Spain, Switzerland, Belarus, Ukraine, India and one Koryak from far eastern Russia. Among its named subclades, U5a2a1a and U5a2a1e are found in Finland, U5a2a1b is found in Russia and Ukraine, U5a2a1c is found in Russia and Belarus, and U5a2a1d is found in Spain, England, Wales and France. U5a2a1 Group F is found in Ireland, Russia and a Koryak from far eastern Russia. U5a2a1 Group G is found in Germany and Spain. U5a2a1 Group H is found in Denmark, Sweden and Scotland. U5a2a1 Group I is found in Poland and Russia. U5a2a1 Group J is found in Sweden, Finland, Russia, Germany and England. U5a2a1 Group K is found in Ireland. U5a2a1 Group L is found in the UK. U5a2a1 Group M is found in Russia and Belarus.
U5a2a* Group 3 has 4 samples from Denmark, Sweden and Russia and has an age estimate of about 3000 years.
U5a2a* Group 4 has 1 sample from a person of European ancestry.
U5a2b has 52 FMS test results and is estimated to be about 12,000 years old. It has 4 named subclades and also nine U5a2b* test results that represent 7 different unnamed lineages, 4 of which have ancestry in Germany, Italy, Russia and Tunisia. U5a2b1 has 22 test results including 11 in U5a2b1* of which two are from Germany and two from Russia, and one each from Portugal, Norway, Poland, the Czech Republic and Ukraine. There are 3 people in U5a2b1a* with ancestry in France, Sicily and Belarus and there are 4 people in U5a2b1a* Group 1 with two from Russia and one each from Germany and Poland. There are also 4 people in U5a2b1b with 2 from Germany and 1 from Switzerland. There are 7 people in U5a2b2 with one each from Belarus, Slovakia, Poland, Ukraine and the Italian Alps. There are 7 people in U5a2b3 with two each from England and Finland and one each from Italy and Germany. There are 7 people in U5a2b4 with two one each from Ireland and Norway.
U5a2c has 111 FMS test results and is estimated to be about 12,000 years old. There are eleven U5a2c* test results, from France, Italy, Ireland and Spain. There are also two ancient U5a2c samples from Denmark and Germany dated at about 10,000 years ago. U5a2c has four named subclades. U5a2c1 (3800 ybp) has 29 test results mostly from northern Europe (Germany, Denmark, Sweden, Ireland, Scotland, England) and two samples from Spain and one from Tunisia. U5a2c2 has only 3 test results from Italy and Finland. U5a2c3 has an ancient sample from Germany dated at 10,600 years ago and 51 samples including 42 in U5a2c3a from northern Europe and nine U5a2c3*b samples from England. U5a2c3a also has two ancient samples from England and France dated at about 4200 years ago. There are also 18 U5a2c4 test results from northern Europe. Given that U5a2c and its subclades are mostly found in northern and western Europe, including ancient Mesolithic hunter-gatherer samples, my guess is that it originated in western/central Europe after the last glacial maximum when hunter-gatherers expanded into northern Europe. Given the low frequency in Europe today and the lack of U5a2c in eastern Europe and Asia, it seems likely that U5a2c hunter-gatherers were mostly replaced by Neolithic and Bronze Age migrations into Europe. [updated May 2021]
U5a2e has 7 test results and an age estimate of 10,000 years. There is one U5a2e* from Finland, one U5a2e1* who is Czech, and five U5a2e1 with two Czechs and one each from Austria, Slovenia and Belarus. The sample size is small but we see a possible connection here between Finns and southeastern Europe, also as discussed for U5b1b1a.
Summary of U5a2: U5a2 is found much less frequently than U5a1, but U5a2 also lived during a time of more rapid population expansion because it has 7 known daughter lineages, including five named subclades and two lineages not yet named. As in the case of U5a1, the majority of U5a2 samples (69%) are in its two largest subclades, U5a2a and U5a2b, and 37% of all U5a2 samples are in U5a2a1 which is dated to about 6000 ybp, suggesting that U5a2a1 lived in a Neolithic population that expanded very rapidly. U5a2 is found most frequently in northern and eastern Europe, including Russia. It is possible that U5a2 was present in multiple ice age refugia. Some of the less common subclades of U5a2 are found primarily in western Europe and may have been present in an ice age refuge in western Europe, while U5a2a and U5a2b are found more frequently in the northern regions of central and eastern Europe, and perhaps were present in an ice age refuge in the Balkans or Italy. From ancient remains we know that U5a2a was already present in Germany 8700 ybp. Another possibility is that U5a2a was present in a southern European ice age refuge, and initially expanded into central and northern Europe as the ice retreated, and then expanded into eastern Europe and Russia. The fact that U5a2 is found infrequently in southern Europe suggests that it was not present in early Neolithic farming communities that expanded from the Near East into Europe. If U5a2a1 was not present among early farmers, perhaps its high frequency in northern Europe today and its rapid expansion 6000 years ago might suggest that U5a2a1 was present in early Neolithic herding communities in eastern and northern Europe? More testing of ancient remains will be needed to better understand the migration history of U5a2.
Haplogroup U5b1
(Last updated April 2013)
U5b1 has 232 FMS samples with 6 named subclades (U5b1a to U5b1f) and there are more than 20 additional U5b1* FMS test results that do not belong to any of the named subclades. These 20 test results represent 15 additional distinct daughters of U5b1; thus, U5b1 has by far the greatest diversity of the five major U5 subclades. A large number of these U5b1* samples have been found in Spain, which suggests a possible Iberian origin for U5b1. Single samples of U5b1* test results have also been found in Scotland, England, Ireland, the UK, Germany, Croatia and Belarus. What can we conclude about the age and origins of U5b1? There remains uncertainty in U5b1 age estimates, in the range of 16,000 to 24,000 years, and it is challenging to infer ancient origins from current population distributions. It is possible that U5b1 was widespread in Europe before the last glacial maximum and that it retreated to ice age refugia throughout southern Europe. This would explain why some subclades of U5b1 seem to originate in Iberia, while U5b1c seems to originate in Italy, and U5b1e seems to have a more eastern distribution, perhaps the Balkans or the Ukraine. In any case, it is clear that U5b1 was extremely successful with more than 20 surviving lineages. This indicates that U5b1 lived at a time of rapid population growth. However, many of these lineages are currently represented by only a single FMS test result.
U5b1a has only 4 FMS test results and has an age estimate of about 10,000 years. There is one sample each from France and England, and two that are near matches from Poland and Russia. More samples are needed to estimate the age and geographic origins of U5b1a.
U5b1b is estimated to be about 11,000 years old and has 120 FMS test results with 98 of these in U5b1b1 (7200 ybp) and 20 in U5b1b2 (3000 ybp). There is also a single U5b1b* test result found in Russia, and a single test result that is pre-U5b1b1 (HM046248 from Spain) that has only one of the two mutations that define U5b1b1. It seems quite remarkable that virtually all of the U5b1b test results are in two major subclades U5b1b1 and U5b1b2. This indicates a population bottleneck with very slow growth in U5b1b for several thousand years followed by very rapid growth in U5b1b1 beginning about 7000 ybp and in U5b1b2 about 3000 ybp.
U5b1b1 is found throughout Europe and Africa. There are 18 test results that are U5b1b1* and these are found throughout Europe and also among the Berber people in north Africa. U5b1b1a (4000 ybp) is the largest subclade with 67 test results, and this is the so-called “Saami signature” that is found at very high frequency among the Saami indigenous people of northern Scandinavia. However, U5b1b1a is also found frequently in eastern Europe with 7 test results from Belarus, Slovakia, Poland, Russia, Hungary, Bosnia and Croatia. One intriguing possibility is that U5b1b1a might indicate a common genetic ancestry among speakers of Uralic languages, including Finnish and Hungarian. U5b1b1a has several named sister groups including U5b1b1b which has been found in Africa and Puerto Rico and might indicate a recent back migration of Europeans into Africa perhaps 3,000 years ago. U5b1b1d has only two FMS test results with ancestry in Italy and Spain. U5b1b1e has 3 FMS test results, two of which have North African Berber ancestry. U5b1b1f has 4 FMS test results with ancestry in Germany, Italy, Russia and the Czech Republic.
U5b1b2 is estimated to be about 3,000 years old and has 21 FMS test results, mostly found in Finland, and 1 each in Ireland, Germany, Sweden and Norway. It is interesting that U5b1b1 is found throughout Europe and also in Africa, while U5b1b2 is relatively young and appears to be restricted primarily to Scandinavia. Did it arrive in Scandinavia together with U5b1b1a or did it have a different migration history?
U5b1c has 23 FMS test results and is estimated to be about 11,000 years old. There are 3 U5b1c* test results all of which have ancestry in Italy. The named subclades are U5b1c1 and U5b1c2. There are five U5b1c1 test results with ancestry in Italy, Spain, Scotland and the UK. U5b1c2 has 15 FMS test results, four are located in Ireland or UK, and one each in Spain, Poland and Croatia. It seems likely that U5b1c originated in Italy some 11,000 years ago and later expanded to other parts of Europe.
U5b1d is estimated to be about 12,000 years old and we have 15 FMS results. Those with known ancestry have been found in Italy, France, Ireland and Berber North Africa.
U5b1e is estimated to be about 8,000 years old and we have 14 FMS results mostly found in eastern Europe including Russia, Ukraine, and Slovakia. There is also one each found in Germany, Poland, England, Finland, Norway and the Czech Republic.
U5b1f has 4 FMS test results with 2 from Spain and one each from France and Germany. There are too few samples to predict the age with confidence, but an initial age estimate is between 4,000 and 8,000 years. If we predict membership in U5b1f based on HVR test results, this group appears to be found very frequently in Spain and among the Basque people. In a 2012 study of the Basque region by Behar et al., 12% of the Basques are in haplogroup U5b1f1a, while another 5% are in other subclades of U5.
U5b1g has 4 FGS test results and has only been found in Spain.
Haplogroup U5b2
(Last updated April 2013)
Haplogroup U5b2 has been estimated to be about 20,000 years old by Behar or about 22,000 years old by Soares, and it seems very likely that U5b2 was present in several different ice age refugia. U5b2 has 3 major subclades: U5b2a has 254 FMS test results and an age estimate of about 15,000 years. U5b2b is considerably smaller with 118 FMS test results and an age estimate of about 15,000 years. U5b2c has only 59 FMS test results and an age estimate of about 10,000 years. Finally, we have 5 U5b2* FMS samples including 4 U5b2*d samples from England, and one U5b2*e sample from India that might represent a branch of U5b2 that migrated to south Asia during the ice age. It would be very interesting to see more U5b2 test results from south Asia.
U5b2a1 has an age estimate of about 14,000 ybp and is widespread in Europe, including Russia. U5b2a1a has 51 FMS test results. It is estimated to be about 11,000 years old and is found throughout central and northern Europe. It also has a large number of samples and several named subclades. U5b2a1b has 12 FMS test results, an age estimate of about 3000 years and has samples found in Germany, England, Ireland, Poland, the Czech Republic and Russia. It seems likely that this subclade might have been present in an early Germanic tribe that subsequently spread into countries with some Germanic ancestry.
U5b2a2 (11,000 ybp) has 38 samples and is more frequent in central Europe (5 Germany, 4 Poland, 3 UK, 2 Netherlands, and 1 each Italy, Czech, Finland, Belarus, and 20 unspecified), and the lack of U5b2a2 in Russia might suggest an ice age refuge for U5b2a in Italy. U5b2a2 has a much younger age estimate, about 12,000 ybp, so this does suggest some uncertainty in the age of U5b2a, but that age estimate is dominated by 2 large subclades, and 3 U5b2a2* FMS results suggest an age of 19,000 years, so an age estimate of about 16,000 years seems reasonable for both U5b2a1 and U5b2a2.
U5b2a3 (11,000 ybp) has only 4 FMS test results with ancestry in Ireland, the UK and Germany.
U5b2a4 has 5 FMS samples but only 2 with known ancestry, in England and Norway.
U5b2a5 [updated Mar 2023] is defined by mutations at markers 8706, 10654, 11725 and 16311, and it has an estimated age of about 9000 years. There are 18 samples in three subclades: U5b2a5a is defined by a mutation at marker 3394 and has 6 members from Finland with an estimated age of about 4000 years. U5b2a5*B has extra mutations at markers 9055 and 15940 and has four members from Denmark, Sweden, Bulgaria and Russia, and its subgroup U5b2a5*B1 has an extra mutation at marker 2581 and has seven members but only one with known ancestry from England. U5b2a5*C is defined by extra mutations at markers 5123, 13194 and 16223 and has 5 members with ancestry in England and Ireland.
U5b2a6 has 5 FMS samples but none have known ancestry.
There is also one U5b2a* FMS result with ancestry in Spain. We have not found several old branches of U5b2a* in Spain (as is the case for U5b1), so an ice age refuge for U5b2a in Iberia seems less likely, while an ice age refuge in the Franco-Cantabrian region or Italy seems more probable. My guess is that U5b2a expanded from an ice age refuge into northern Europe at an early date and was largely replaced in southern Europe.
U5b2b has 59 FMS test results with 4 named subclades (U5b2b1 to U5b2b4) and it also has 4 distinct U5b2b* lineages that are not yet named. U5b2b has an age estimate of about 15,000 years and its present distribution seems shifted more to the west compared to U5b2a. There were only 2 U5b2b tests in the 2010 Malyarchuk et al. study (1 Russia and 1 Slovak), and we have many more U5b2b project members in western Europe. The four U5b2b* have ancestry in the Netherlands, Germany, UK & Sardinia, and Scotland & Ireland. Italy or the Franco-Cantabrian region seem like possible ice age refuge origins for U5b2b.
U5b2b1 has an age estimate of about 10,000 years and 14 FMS test results with 4 from the UK, and one each from Germany, Poland, Russia and Slovakia.
U5b2b2 has an age estimate of 12,000 years based on 4 FMS samples with 1 from England and 1 from France.
U5b2b3 has an age estimate of about 9,000 years with 13 FMS samples (including a U5b2b2* from France, U5b2b2a* from Spain and Portugal, and U5b2b2a1 from Ireland, the UK, Denmark and Germany).
U5b2b4 has an age estimate of 5200 years based on 20 FMS samples with 6 from England, 3 from Germany, and 1 each from the Netherlands, Norway, Sweden, Poland and Switzerland.
U5b2c has an age estimate of about 15,000 years based on 20 FMS samples. It has been found exclusively in western Europe. There is one U5b2c* person with ancestry in Ireland.
U5b2c1 has 6 FMS samples including 2 from Spain, and one each from Ireland, England and Germany. One of the Spanish samples is from ancient human remains. Sanchez-Quinto et al. reported an FMS test result for the 7,000 year old remains of a Mesolithic hunter-gatherer at the La Brana-Arintero site which they identified as U5b2c1. Behar et al. estimated U5b2c1 to be about 4000 years old, although with large uncertainty in the date, while my age estimate for U5b2c1 based on the six modern FMS samples is 5,700 years. The La Brana-Arintero sample is at the upper end of the Behar uncertainty range and this raises the question of whether haplogroup ages might be older than estimated by Behar et al., and perhaps the slightly older estimates by Soares et al. might be more accurate. But it is not possible to reach conclusions from a single ancient DNA sample. The presence of U5b2c1 in Ireland and northwest Spain might be indicative of early population exchange between those areas.
U5b2c2 has an age estimate of 4800 years based on 20 FMS samples. This group includes 4 people from Ireland, 2 from Scotland and one each from England and Sweden. It seems likely that U5b2c had its origins in an Iberian or Franco-Cantabrian ice age refuge and arrived in the British Isles at a very early date, based on its frequency and diversity in Ireland.
Haplogroup U5b3
(Last updated April 2013)
Haplogroup U5b3 is relatively rare compared to its sister clades U5b1 and U5b2. U5b3 has been estimated by Behar et al. to be about 11,000 years old. We have 73 FMS test results for U5b3, however most of these are from research studies specifically designing to study the population of Sardinia. We have a much smaller number of U5b3 test results in the U5 project. Although U5b3 is quite rare, it also has great diversity. We have 7 named subclades of U5b3 (U5b3a to U5b3f, but several of these have only 2 or 3 members), and we also have nine U5b3* lineages that are each represented by a single individual. They have ancestry in Spain, France, Germany, northern Italy, Croatia, Bosnia and Czech. One of the key research papers on U5b3 is by Pala et al., “Mitochondrial haplogroup U5b3: a distant echo of the epipaleolithic in Italy and the legacy of the early Sardinians”, and they conclude that "the most likely homeland for U5b3 was the Italian Peninsula". The current distribution of U5b3* test results could be consistent with an origin in northern Italy or central Europe and a relatively late arrival in Sardinia, perhaps arriving with bronze-age metal workers.
U5b3a has 28 samples and an age estimate of about 10,000 years. U5b3a1a has an age estimate of 2500 years and has 17 samples (all from research studies) with 17 from the island of Sardinia, 1 from Italy and one unspecified. There are 2 samples of U5b3a1b also from research studies, one of which was from France. U5b3a2 has 9 samples and is more widely distributed and with an older age estimate of about 6500 years. It has been found in central Italy, France, Greece, Estonia, England and Morocco. Five of these test results are also from the Pala et al. research study.
U5b3b has 13 FMS samples and an age estimate of about 4800 years. It has 5 members in Group 1 with ancestry in Germany and England who share a mutation at 16526. There are also 5 members in Group 2 with ancestry in Spain, France, Greece and Czech. There are also three U5b3b* members with ancestry in central Italy, Scotland, and Norway. It's interesting that U5b3b is found so widely distributed in Europe given its relatively young age estimate.
U5b3c has only 3 samples, all from the Pala et al. study, with ancestry in Sardinia, southern Italy and Spain, and with an age estimate of about 3000 years.
U5b3d has only 2 samples, also from the Pala et al. study, with ancestry in southern Spain and Iraq.
U5b3e has 7 samples (including 4 from the Pala et al. study), 2 each with ancestry in England and the Netherlands, and the others from Germany, Czech and Bulgaria. U5b3e has an age estimate of about 4000 years.
U5b3f also has 7 samples (including 5 from the Pala et al. study), with 3 from central Italy and 1 from Spain and an age estimate of about 1000 years.
U5b3g has 4 samples, with one from southern Italy and the other three of unknown ancestry.
Case Studies of U5 Diversity in Finn and Basque Populations
The Saami and the Basque are both intensively studied populations, in part because they both speak non-Indo-European languages, and this has led to the theory that they could represent the descendants of Mesolithic Europeans. Haplogroup U5 is also found at high percentages in both populations, with approximately 50% of Saami, 22% of Finns and 18% of Basques identified as U5. These percentages are significantly greater than those of other European populations, which typically range from about 4% to 12% U5 with a European average of 9% U5 (based on the data from Richard et al., 2007, An mtDNA perspective of French genetic variation).
As of November 1, 2012, we have 113 FMS test results from Finland (including the samples from the 1000 Genomes Project). It is interesting that there is very little diversity in the Finnish U5 distribution, especially when considering U5b. Nearly 40% of the Finnish U5 are in U5b1b1a (the "Saami motif"). This subclade is also found in eastern Europe and might have arrived in Finland by an eastern European route (perhaps along with haplogroup V, as suggested by Tambets et al., 2004). (Note that in their paper Tambets et al. refer to the Saami motif with 16144 as "U5b1b1".)
23% of the Finnish U5 samples are in U5b1b2. This group seems to have a more western origin with 3 samples from Ireland or England and one each from Germany, Norway and Sweden. It might have arrived in Finland by a more westerly route. Perhaps some of the diversity in U5b was lost by bottlenecks and drift, although perhaps migration and replacement by eastern European and Asian mtDNA haplogroups are partly responsible for the lack of diversity in U5b in Finland. It is also interesting that 13% of the Finnish U5 are in U5a2a1. This group is found at low frequency throughout Europe and is most often found in northern and eastern Europe, from Germany to Russia and Scandinavia. U5a2a has also been found in ancient remains in Germany dating to 8700 ybp, and ancient remains in Denmark dating to about 4000 ybp.
Table 1. Diversity in Finnish U5 full genome mtDNA samples.
N Subclade Percent
45 U5b1b1a 40%
26 U5b1b2 23%
11 U5b2a 10%
3 U5a1a1 3%
8 U5a1b 7%
15 U5a2a1 13%
3 U5a2b 3%
2 U5a2* 2%
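The percentages in Table 1 follow directly from the counts in the N column. As a quick check, the minimal Python sketch below recomputes them; the variable names and the grouping into U5a and U5b by name prefix are illustrative choices and are not part of the original analysis.

```python
# Counts copied from Table 1 (Finnish U5 full-genome samples).
finnish_u5_counts = {
    "U5b1b1a": 45,
    "U5b1b2": 26,
    "U5b2a": 11,
    "U5a1a1": 3,
    "U5a1b": 8,
    "U5a2a1": 15,
    "U5a2b": 3,
    "U5a2*": 2,
}

total = sum(finnish_u5_counts.values())

# Share of each subclade, rounded to a whole percent as in the table.
percent_by_subclade = {
    name: round(100 * n / total) for name, n in finnish_u5_counts.items()
}

# Illustrative grouping: split the counts by major branch using the name prefix.
branch_totals = {"U5a": 0, "U5b": 0}
for name, n in finnish_u5_counts.items():
    branch_totals[name[:3]] += n

print(f"total samples: {total}")
for name, pct in percent_by_subclade.items():
    print(f"{finnish_u5_counts[name]:>3}  {name:<8} {pct}%")
print(branch_totals)
```

The counts sum to the 113 Finnish FMS results mentioned above, and the branch totals make the dominance of U5b in the Finnish sample explicit.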
These results suggest that the majority of Finnish U5 are represented by subclades that have relatively young age estimates, and the lack of diversity in older U5b subclades does not indicate a Mesolithic origin. However, it is possible that older and more diverse U5 subclades have been lost as a result of population bottlenecks and genetic drift.
A similar process seems to have occurred in the Basque population, with a large percentage of the Basque U5 mtDNA falling into a relatively young subclade U5b1f1a. In a 2012 study of the Basque by Behar et al., haplogroup U5 represented approximately 18% of the Basque population. However, about two-thirds of these were in U5b1f1a. (While Behar et al. did not compile statistics on haplogroup U5, this estimate is based on HVR data included in the supplement to the paper). After sorting the samples by language, the highest percentage of U5b1f1a was found in the Basque and French speaking populations in the Basque region.
Table 2. Summary of U5b1f1a by language (based on data from Behar et al., 2012b)
Basque 75/599 = 12.5%
French 20/164 = 12.2%
Castilian 4/124 = 3.2%
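Because the three language groups differ greatly in sample size, it can help to look at the uncertainty around each percentage as well as the percentage itself. The sketch below recomputes the Table 2 values and adds an approximate 95% Wilson score interval for each. The interval is an assumption introduced here for illustration and was not part of the analysis in Behar et al. or in the U5 project; the function and variable names are hypothetical.

```python
import math

# (U5b1f1a count, total samples) per language group, copied from Table 2.
u5b1f1a_by_language = {
    "Basque": (75, 599),
    "French": (20, 164),
    "Castilian": (4, 124),
}

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (illustrative)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

for language, (count, n) in u5b1f1a_by_language.items():
    low, high = wilson_interval(count, n)
    print(f"{language:<10} {count}/{n} = {100 * count / n:.1f}% "
          f"(approx. 95% interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

The width of each interval gives a rough sense of how strongly the smaller French and Castilian samples constrain the comparison with the Basque speakers.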
In contrast, in the nearby region of Asturias, only one U5b1f1a was observed, 0.2% of the total sample size. When excluding U5b1f1a, the remaining U5 samples were very similar in the Basque and Asturian populations, with 5.6% of the population in U5 (excluding U5b1f1a) and with a similar distribution of U5 subclades in the two populations.
A 2013 study by Cardoso et al. performed full genome sequencing of ten of the Basque U5b1f1a samples, and this group has an age estimate of about 3000 ybp. This suggests that the high percentage of U5b1f1a (and thus U5) in the Basque is a result of a relatively recent U5b1f1a founder effect and population drift. Note that this conclusion is in contrast to that of Cardoso et al., who conclude that the presence of U5b1f indicates a pre-Neolithic origin for the Basque. However, this conclusion is not supported by data in the U5 project which shows the greatest diversity of U5b1f in Germany and very little diversity in the Basque U5b1f1a.
Haplogroup U5 population expansions and bottlenecks
The age estimates for U5 and its subclades are still uncertain; for example, age estimates for U5a and U5b vary from 22,000 years (Behar et al.) to 27,000 years (Soares et al.), and each of those estimates has an uncertainty range of several thousand years. While we cannot yet be certain of their exact age, we can compare those estimates to important events that would have impacted U5 population growth and attempt to determine if periods of slow or rapid expansion of U5 subclades are consistent with those events. Key events included:
(1) the warm period from about 70,000 to 30,000 years ago, followed by a return to colder conditions;
(2) the Last Glacial Maximum (LGM) about 22,000-17,000 years ago;
(3) the shift to a warmer and moister inter-glacial period around 14,500 years ago;
(4) the Younger Dryas cold period from about 12,800 to 11,500 years ago, followed by a rapid return to warm conditions in Europe;
(5) the adoption of agriculture in the Near East during the Neolithic Revolution beginning about 12,000 years ago, and the expansion of farming populations from the Near East, beginning during the Neolithic about 8,500 years ago in southeastern Europe, but not fully expanding into northern Europe until about 5,000 years ago. (See link above and text below for a more complete description of the glacial period).
(6) the Bronze Age migration from the Steppe region in eastern Europe and west Asia that brought Indo-European languages to western Europe. Certain subclades of U5, including U5a1a1, have been found in ancient remains of the Yamnaya Steppe people, and the high frequency of U5a1a1 in Europe today is likely a result of the Bronze Age migrations from the Steppe region.
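The comparison of age estimates against the events listed above can be sketched in a few lines of Python. The dates come from the list and from age estimates quoted in this text; the data layout and the function name place_on_timeline are hypothetical, and the sketch ignores the uncertainty ranges of several thousand years that apply to every estimate.

```python
# Dated ranges from the list of key events (years before present).
# Each entry is (label, older bound, younger bound).
RANGES = [
    ("warm period", 70_000, 30_000),
    ("Last Glacial Maximum", 22_000, 17_000),
    ("Younger Dryas", 12_800, 11_500),
]

# Events given as a single approximate starting date.
ONSETS = [
    ("post-glacial warming", 14_500),
    ("farming in southeastern Europe", 8_500),
    ("farming across northern Europe", 5_000),
]

def place_on_timeline(age_ybp):
    """Return (ranges containing the age, onsets that had already begun)."""
    during = [name for name, old, young in RANGES if young <= age_ybp <= old]
    after = [name for name, onset in ONSETS if age_ybp <= onset]
    return during, after

# Point estimates quoted in the text; real estimates carry wide uncertainty.
AGE_ESTIMATES = {
    "U5a/U5b (Behar et al.)": 22_000,
    "U5a/U5b (Soares et al.)": 27_000,
    "U5a1a1": 7_000,
    "U5a2a1": 6_000,
    "U5b1b1a": 4_000,
}

for name, age in AGE_ESTIMATES.items():
    during, after = place_on_timeline(age)
    print(f"{name}: {age} ybp, during={during}, after={after}")
```

A fuller treatment would test whether each estimate's uncertainty range overlaps an event, rather than using a single point value.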
We assume that U5 originated in Europe before the Last Glacial Maximum because of the estimated age of U5 of about 30,000 to 35,000 years, and because U5 has been found in ancient remains at the Dolni Vestonice burial site in the Czech Republic that has been dated to 31,155 years ago, and because the highest frequency and greatest diversity of U5 is found among present day people with European ancestry.
There are five additional mutations that distinguish U5 from U, with no other branch points or sister clades, so we can assume that U5 experienced a long period of very slow population growth or a population bottleneck in Europe. One possibility is that haplogroup U arrived in Europe with the first modern humans some 47,000 years ago, and U5 originated in Europe, accumulating its five extra mutations over a 15,000-year period in Europe. Other migration histories are possible, and ultimately we will need ancient DNA test results to know the actual migration history. Currently, the earliest certain evidence of U5 in Europe is mtDNA of ancient remains dating to about 12,000 years ago. The population bottleneck in U5 (i.e., the lack of sister clades from before 30,000 years ago) seems consistent with a population collapse that would have occurred during the LGM as U5 populations retreated into ice age refugia in southern Europe. Possible ice age refugia for U5 include the Iberian Peninsula, the Franco-Cantabrian Region, the Italian Peninsula, Greece and the Balkans, Anatolia, and Ukraine. If the U5a and U5b age estimates of 27,000 years are accurate, both would have been present in pre-glacial populations in Europe and could also have been present in multiple refugia. The fact that U5a and U5b have only two and three surviving lineages, respectively, suggests that they lived at a time of very slow population growth, and this seems consistent with the possibility of U5a1, U5a2, U5b1, U5b2 and U5b3 each having origins in southern Europe during the glacial period. Next, we see very rapid population growth of each of these subclades, as each has eight or more known surviving lineages. This seems consistent with these five groups and their subclades experiencing rapid population growth as they expanded into central and northern Europe as the ice retreated approximately 14,000 years ago.
And we also have strong evidence based on mtDNA testing of the remains of ancient hunter-gatherers that U5 and its subclades (along with its sister groups U4, K and U2) were the dominant population groups in Europe during this period.
The next stage in the story of U5 begins with the Neolithic and the adoption of agriculture in Europe. There has been vigorous and controversial debate over the ancestry of the first European farmers. The theory of Demic Diffusion holds that early farmers migrated from the Near East into Europe and largely replaced the previous Mesolithic European populations. The theory of Cultural Diffusion holds that existing Mesolithic Europeans adopted agricultural technologies by a process of diffusion of cultural practices with limited migration of farmers from the Near East into Europe. Of course, some genetic mixing is possible with both theories, but in demic diffusion existing populations are mostly replaced, while in cultural diffusion there is limited replacement. While the debate continues, genetic evidence seems to support the theory that Mesolithic Europeans were mostly replaced by multiple waves of migration from the Near East and West Asia. Cultural diffusionists believe that y-DNA haplogroup R1b and mtDNA haplogroup H represent the original Paleolithic populations of Europe. Migrationists believe that R1b and H represent Neolithic and Bronze Age immigrants who mostly replaced earlier European populations. DNA testing of ancient remains shows that most Mesolithic hunter-gatherers in Europe were mtDNA haplogroups U5 and U4. Other studies have also shown a lack of genetic continuity from European Neolithic farmers to present day Europeans, and this suggests that there may have been multiple waves of migration and population replacement. However, because of the limited number of ancient DNA samples and challenges in accurately reading ancient DNA, there continues to be vigorous debate over these theories.
In a 2012 study by Fu et al., Complete Mitochondrial Genomes Reveal Neolithic Expansion into Europe, the authors analyze the dates at which haplogroups U and H underwent population expansions and contractions, and found "a population expansion between 15,000 and 10,000 years before present (YBP) in mtDNA typical for hunters and gatherers, with a decline between 10,000 and 5,000 YBP. These corresponded to an analogous population increase approximately 9,000 YBP for mtDNA typical of early farmers. The observed changes over time suggest that the spread of agriculture in Europe involved the expansion of farming populations into Europe followed by the eventual assimilation of resident hunter-gatherers."
The summary of the U5 subclades presented here is consistent with the conclusions of Fu et al. We see very rapid population expansion beginning with subclades of U5a1, U5a2, U5b1, U5b2 and U5b3 around 15,000 years ago. But what is most interesting is that certain of these subclades have very different patterns of population expansion than others. We have a large number of U5 subclade lineages that date to around 15,000 years ago that are represented by a single sample in the U5 project. In contrast, we have a few relatively young subclades that represent a large percentage of the U5 test results. For example, U5a1a1 with an age of 7,000 years has 94 members that represent 31% of all U5a1; U5a2a1 with an age of 6,000 years has 63 members that represent 37% of all U5a2; and U5b1b1a with an age estimate of 4,000 years has 66 members that represent 52% of all U5b1. (Also, U5b1f represents 65% of all U5 samples in another 2012 study by Behar et al. of the Basque people; however, U5b1f has not yet been reliably dated and could be older than the other large subclades discussed here.) Thus, a large number of U5 test results represent young subclades that began very rapid population expansion during the Neolithic. Several interesting questions remain: were these young, large U5 subclades part of Mesolithic populations who were adopted into groups of Neolithic immigrant communities as they began to expand into southeastern Europe? Were the older and rarer U5 subclades part of Mesolithic communities on the fringe of western or northern Europe who adopted Neolithic technologies at a much later date? Or are these differences the result of population drift or selection? Can the analysis of the 2012 Fu et al. study be repeated with exclusion of U5a1a1, U5a2a1 and U5b1b1a to see if this reveals a stronger signal of population differences between Mesolithic U5 and Neolithic immigrants? To what extent does oversampling of certain populations affect these results?
For example, it seems likely that we have a much higher sample frequency from people of northwest European ancestry compared to other parts of Europe, Asia and Africa.
The Last Glacial Maximum
I adapted the following description of the LGM and Younger Dryas from the Oak Ridge National Laboratory's "A quick background to the last ice age":
After about 30,000 years ago, the Earth's climate system entered another big freeze-up; temperatures fell, deserts expanded and ice sheets spread across the northern latitudes. This cold and arid phase which reached its most extreme point sometime around 21,000-17,000 years ago is known as the Late Glacial Cold Stage. The point at which the global ice extent was at its greatest, about 21,000 years ago is known as the Last Glacial Maximum. The Last Glacial Maximum was much more arid than present almost everywhere, with desert and semi-desert occupying huge areas of the continents and forests shrunk back into refugia. But in fact, the greatest global aridity (rather than ice extent) may have been reached slightly after the Last Glacial Maximum, somewhere during the interval 19,000-17,000 years ago.
Warming, then a cold snap: Around 14,000 years ago, there was a rapid global warming and moistening of climates, perhaps occurring within the space of only a few years or decades. Conditions in many mid-latitude areas appear to have been about as warm as they are today, although many other areas - whilst warmer than during the Late Glacial Cold Stage - seem to have remained slightly cooler than at present. Forests began to spread back, and the ice sheets began to retreat. However, after a few thousand years of recovery, the Earth was suddenly plunged back into a new and very short-lived ice age known as the Younger Dryas. Although the Younger Dryas did not affect everywhere in the world, it destroyed the returning forests in the north and led to a brief resurgence of the ice sheets. The main cooling event that marks the beginning of the Younger Dryas seems to have occurred within less than 100 years, according to Greenland ice core data. After about 1,300 years of cold and aridity, the Younger Dryas seems to have ended in the space of only a few decades when conditions became as warm as they are today.
List of Ancient mtDNA U5 Samples
Updated 1/1/2023
Subclade, age (approximate years before present), Lead Author, GenBank ID other ID, Culture/Location: list of extra mutations
Updates on the Project
April-June 2011
The three U5 projects are reorganized to collaborate. All U5 members will be part of the main U5 project, and people who tested the full mtDNA sequence can also join the U5a or U5b FMS projects.
Earlier Project Updates
4 Sept 2010
We have found 3 new daughter groups of U5a2 among the U5 project members. Here are the current results of all U5a2 including both the U5 project and all full mtDNA sequences published in GenBank:
U5a2a has 44 members and is widely distributed throughout Europe
U5a2b has 37 members with a dominant daughter of 28 members that is mostly located in central/eastern Europe
U5a2c has 12 members and is mostly located in western Europe
U5a2d has 2 members with one of them in Finland
U5a2e has 3 members in eastern Europe (Slovenia, Belarus, Czech)
The 3 new unnamed daughter groups of U5a2 have origins in France, Portugal and England
11 Aug 2010
We currently have 840 U5 members in the project including 441 in U5a, 397 in U5b, and 2 not assigned to U5a or U5b. U5a is mostly U5a1 with 312 members and there are 129 members in U5a2. U5b subgroups are more difficult to predict, but there are currently 70 members in U5b1, 90 members in U5b2, and 24 members in U5b3. Another 56 people are in the U5a project including 47 U5a and 9 U5b. Combining both projects we have 896 U5 test results. Approximately 200 U5 people have done the full mtDNA sequence test (FGS or FMS).
15 July 2010
Jørgen, Gail and Andreas have joined as co-administrators for the U5 project to organize the results and update the trees for U5a1, U5a2 and U5b, respectively. All project members have been placed in groups based on results of the Full mtDNA sequence or predicted groups based on HVR1 and HVR2 results.
U5a1* Group I is a newly discovered and fairly rare branch of U5a1. We only have 9 test results and Group I is estimated to be about 10,000 years old. Seven of these people share 2 extra mutations and this Group I1 is estimated to be about 5,000 years old. U5a1* Group I is rather unusual in that it has a mix of 6 people with north European ancestry (Norway, Germany, UK) and 2 people with south Asian ancestry. My guess is that Group I was present among early hunter-gatherers who repopulated Europe after the last glacial maximum, about 10,000 years ago. We will need more test results to determine if the south Asian branches of Group I reflect ancient or recent migrations from Europe to south Asia.
Summary of U5a1: While the precise age and geographic origin of U5a1 remain uncertain, we know that U5a1 lived during a time of more rapid population expansion because it has 14 known daughter lineages, including nine named subclades and five lineages not yet named. The greatest diversity of U5a1 seems to be in central and northern Europe (note that the five very rare unnamed U5a1* lineages have been found in Italy, Tyrol, Germany and Poland). The two dominant subclades U5a1a and U5a1b represent 70% of all U5a1 samples, while the other U5a1 subclades are found much less frequently. This suggests that U5a1a and U5a1b might have been present in populations that began to grow rapidly perhaps around the beginning of the Neolithic period. The presence of U5a1a1 in ancient Yamnaya culture remains supports the theory that U5a1a1 originated in the eastern European steppe and arrived in western Europe with the migrations that brought Indo-European languages to western Europe. Other U5a1 subclades might represent remnants of hunter-gatherer populations that adopted Neolithic farming and herding practices at a later date. Some U5a1 samples, including U5a1d2b and "U5a1* Group I", have been found in central Asia (including ancient remains) and India, and these samples probably represent early migrations of U5a1 populations from Europe into central Asia. It is likely that additional very rare subclades of U5a1 still remain to be discovered, and additional testing of present day populations and ancient remains will lead to a more complete description of the history of U5a1 in Europe.
Haplogroup U5a2
(update in progress - Nov 1, 2014)
U5a2 has been estimated to be around 20,000 years old and it has 316 FMS test results. U5a2 has five named subclades (U5a2a to U5a2e). It also has three unnamed subclades, U5a2* Group F with ancestry in France and Moldova, U5a2* Group G with ancestry in Italy, and U5a2* Group H with ancestry in England.
U5a2a is estimated to be about 12,000 years old. It has 117 FMS test results, but 104 of these are in a single subclade U5a2a1 estimated to be about 6,000 years old. The 13 U5a2a* samples represent three different subclades of U5a2a that are found mostly in northern Europe. U5a2a has a distinctive HVR1 signature, and 2 sets of ancient remains have been identified as U5a2a based on HVR1 test results: remains from Hohlenstein-Stadel, Germany dated to 8,700 years ago, and another set of remains from Damsbo, Denmark dated to 4,200 years ago. The Hohlenstein-Stadel sample appears to be a close match to one of the U5a2a* members of the U5 project. U5a2a is interesting because about 89% of its samples (104 of 117) are in U5a2a1, only one of the four surviving U5a2a lineages. One possible interpretation is that U5a2a1 originated among eastern European hunter-gatherers and in the forest steppe region, where it underwent rapid population expansion beginning about 6,000 years ago, and then expanded into western Europe with the migration of Indo-European speakers. It is possible that the remaining hunter-gatherer U5a2a lineages in western Europe were mostly replaced by Neolithic and Bronze Age immigrants (including U5a2a1), and therefore other subclades of U5a2a are found at very low frequency in northern Europe today.
U5a2a1 is a very diverse group with 13 named or proposed subclades. There are an additional 18 U5a2a1* lineages that are represented by a single sample and 9 samples that are U5a2a1 with no additional mutations. U5a2a1* is found throughout northern Europe, with 13 from Finland, 11 from Russia, 5 from Germany, 4 from the UK, 3 from Poland, 2 from Ireland, 3 from Sweden, and one each from France, Spain, Switzerland, Belarus, Ukraine, India and one Koryak from far eastern Russia. Among its named subclades, U5a2a1a and U5a2a1e are found in Finland, U5a2a1b is found in Russia and Ukraine, U5a2a1c is found in Russia and Belarus, and U5a2a1d is found in Spain, England, Wales and France. U5a2a1 Group F is found in Ireland, Russia and a Koryak from far eastern Russia. U5a2a1 Group G is found in Germany and Spain. U5a2a1 Group H is found in Denmark, Sweden and Scotland. U5a2a1 Group I is found in Poland and Russia. U5a2a1 Group J is found in Sweden, Finland, Russia, Germany and England. U5a2a1 Group K is found in Ireland. U5a2a1 Group L is found in the UK. U5a2a1 Group M is found in Russia and Belarus.
U5a2a2 has 16 samples with an age estimate of about 8000 years. U5a2a2a has 11 samples from Denmark, Finland and Germany. U5a2a2 Group B has 5 samples from England and Ireland.
U5a2a* Group 3 has 4 samples from Denmark, Sweden and Russia and has an age estimate of about 3000 years.
U5a2a* Group 4 has 1 sample from a person of European ancestry.
U5a2b has 52 FMS test results and is estimated to be about 12,000 years old. It has 4 named subclades and also nine U5a2b* test results that represent 7 different un-named lineages, 4 of which have ancestry in Germany, Italy, Russia and Tunisia. U5a2b1 has 22 test results including 11 in U5a2b1* of which two are from Germany and two from Russia, and one each from Portugal, Norway, Poland, Czech and Ukraine. There are 3 people in U5a2b1a* with ancestry in France, Sicily and Belarus and there are 4 people in U5a2b1a* Group 1 with two from Russia and one each from Germany and Poland. There are also 4 people in U5a2b1b with 2 from Germany and 1 from Switzerland. There are 7 people in U5a2b2 with one each from Belarus, Slovakia, Poland, Ukraine and the Italian Alps. There are 7 people in U5a2b3 with two each from England and Finland and one each from Italy and Germany. There are 7 people in U5a2b4 with two one each from Ireland and Norway.
U5a2c has 111 FMS test results and is estimated to be about 12,000 years old. There are eleven U5a2c* test results, from France, Italy, Ireland and Spain. There are also two ancient U5a2c samples from Denmark and Germany dated at about 10,000 years ago. U5a2c has four named subclades. U5a2c1 (3800 ybp) has 29 test results mostly from northern Europe (Germany, Denmark, Sweden, Ireland, Scotland, England) and two samples from Spain and one from Tunisia. U5a2c2 has only 3 test results from Italy and Finland. U5a2c3 has an ancient sample from Germany dated at 10,600 years ago and 51 samples including 42 in U5a2c3a from northern Europe and nine U5a2c3*b samples from England. U5a2c3a also has two ancient samples from England and France dated at about 4200 years ago. There are also 18 U5a2c4 test results from northern Europe. Given that U5a2c and its subclades are mostly found in northern and western Europe, including ancient Mesolithic hunter-gatherer samples, my guess is that it originated in western/central Europe after the last glacial maximum when hunter-gatherers expanded into northern Europe. Given the low frequency in Europe today and the lack of U5a2c in eastern Europe and Asia, it seems likely that U5a2c hunter-gatherers were mostly replaced by Neolithic and Bronze Age migrations into Europe. [updated May 2021]
U5a2d (updated February 2021) Behar et al. estimated U5a2d to be about 17,000 years old, but the U5 project samples indicate an age of about 18,500 years. The U5 project has 53 modern FMS test results and there are also five ancient samples.
The ancient samples include 4 Mesolithic samples from northern Europe: two from Motala Sweden dated at 7700 years ago, one from Latvia dated at 7000 years ago, and one from Ireland dated at 6100 years ago. There is an ancient Corded Ware sample from Germany dated at 4400 years ago, and this ancient sample is in U5a2d* Group 3 and shares extra mutations with 3 project members from Germany and Portugal.
There are 53 modern U5a2d samples; these include seven in U5a2d* Group 2 with ancestry from England, Scotland, the UK, Ukraine and Slovakia; two in U5a2d* Group 4 with ancestry from Italy and Armenia; and two in U5a2d* Group 5 with unknown maternal ancestry. There are 42 people in U5a2d1 with an estimated age of about 7000 years. These include 32 people in U5a2d1a (2300 years old) mostly with Scandinavian ancestry, and 8 people in U5a2d1* group B (1600 years old) with ancestry in England, Scotland and Ireland. Based on the ancient samples and modern distribution, U5a2d had an Ice Age origin in Europe about 18,000 years ago and expanded into northern Europe with early hunter-gatherers as the ice retreated. U5a2d was probably widespread in Mesolithic European hunter-gatherers but was mostly replaced by Neolithic and Bronze Age migrations. Remnants of U5a2d are mostly found today in Scandinavia, the UK and Ireland with smaller pockets in other areas of Europe.
U5a2e has 7 test results and an age estimate of 10,000 years. There is one U5a2e* from Finland, one U5a2e1* who is Czech, and five U5a2e1 with two Czechs and one each from Austria, Slovenia and Belarus. The sample size is small but we see a possible connection here between Finns and southeastern Europe, also as discussed for U5b1b1a.
Summary of U5a2: U5a2 is found much less frequently than U5a1, but U5a2 also lived during a time of more rapid population expansion because it has 7 known daughter lineages, including five named subclades and two lineages not yet named. As in the case of U5a1, the majority of U5a2 samples (69%) are in its two largest subclades, U5a2a and U5a2b, and 37% of all U5a2 samples are in U5a2a1 which is dated to about 6000 ybp, suggesting that U5a2a1 lived in a Neolithic population that expanded very rapidly. U5a2 is found most frequently in northern and eastern Europe, including Russia. It is possible that U5a2 was present in multiple ice age refugia. Some of the less common subclades of U5a2 are found primarily in western Europe and may have been present in an ice age refuge in western Europe, while U5a2a and U5a2b are found more frequently in the northern regions of central and eastern Europe, and perhaps were present in an ice age refuge in the Balkans or Italy. From ancient remains we know that U5a2a was already present in Germany 8700 ybp. Another possibility is that U5a2a was present in a southern European ice age refuge, and initially expanded into central and northern Europe as the ice retreated, and then expanded into eastern Europe and Russia. The fact that U5a2 is found infrequently in southern Europe suggests that it was not present in early Neolithic farming communities that expanded from the Near East into Europe. If U5a2a1 was not present among early farmers, perhaps its high frequency in northern Europe today and its rapid expansion 6000 years ago might suggest that U5a2a1 was present in early Neolithic herding communities in eastern and northern Europe? More testing of ancient remains will be needed to better understand the migration history of U5a2.
Haplogroup U5b1
(Last updated April 2013)
U5b1 has 232 FMS samples with 6 named subclades (U5b1a to U5b1f) and there are more than 20 additional U5b1* FMS test results that do not belong to any of the named subclades. These 20 test results represent 15 additional distinct daughters of U5b1; thus, U5b1 has by far the greatest diversity of the five major U5 subclades. A large number of these U5b1* samples have been found in Spain, which suggests a possible Iberian origin for U5b1. Single samples of U5b1* test results have also been found in Scotland, England, Ireland, the UK, Germany, Croatia and Belarus. What can we conclude about the age and origins of U5b1? There remains uncertainty in U5b1 age estimates, in the range of 16,000 to 24,000 years, and it is challenging to infer ancient origins from current population distributions. It is possible that U5b1 was widespread in Europe before the last glacial maximum and that it retreated to ice age refugia throughout southern Europe. This would explain why some subclades of U5b1 seem to originate in Iberia, while U5b1c seems to originate in Italy, and U5b1e seems to have a more eastern distribution, perhaps the Balkans or the Ukraine. In any case, it is clear that U5b1 was extremely successful with more than 20 surviving lineages. This indicates that U5b1 lived at a time of rapid population growth. However, many of these lineages are currently represented by only a single FMS test result.
U5b1a has only 4 FMS test results and has an age estimate of about 10,000 years. There is one sample each from France and England, and two that are near matches from Poland and Russia. More samples are needed to estimate the age and geographic origins of U5b1a.
U5b1b is estimated to be about 11,000 years old and has 120 FMS test results with 98 of these in U5b1b1 (7200 ybp) and 20 in U5b1b2 (3000 ybp). There is also a single U5b1b* test result found in Russia, and a single test result that is pre-U5b1b1 (HM046248 from Spain) that has only one of the two mutations that define U5b1b1. It seems quite remarkable that virtually all of the U5b1b test results are in two major subclades U5b1b1 and U5b1b2. This indicates a population bottleneck with very slow growth in U5b1b for several thousand years followed by very rapid growth in U5b1b1 beginning about 7000 ybp and in U5b1b2 about 3000 ybp.
U5b1b1 is found throughout Europe and Africa. There are 18 test results that are U5b1b1* and these are found throughout Europe and also among the Berber people in north Africa. U5b1b1a (4000 ybp) is the largest subclade with 67 test results, and this is the so-called “Saami signature” that is found at very high frequency among the Saami indigenous people of northern Scandinavia. However, U5b1b1a is also found frequently in eastern Europe with 7 test results from Belarus, Slovakia, Poland, Russia, Hungary, Bosnia and Croatia. One intriguing possibility is that U5b1b1a might indicate a common genetic ancestry among speakers of Uralic languages, including Finnish and Hungarian. U5b1b1a has several named sister groups including U5b1b1b, which has been found in Africa and Puerto Rico and might indicate a recent back migration of Europeans into Africa perhaps 3,000 years ago. U5b1b1d has only two FMS test results with ancestry in Italy and Spain. U5b1b1e has 3 FMS test results, two of which have North African Berber ancestry. U5b1b1f has 4 FMS test results with ancestry in Germany, Italy, Russia and the Czech Republic.
U5b1b2 is estimated to be about 3,000 years old and has 21 FMS test results, mostly found in Finland, and 1 each in Ireland, Germany, Sweden and Norway. It is interesting that U5b1b1 is found throughout Europe and also in Africa, while U5b1b2 is relatively young and appears to be restricted primarily to Scandinavia. Did it arrive in Scandinavia together with U5b1b1a or did it have a different migration history?
U5b1c has 23 FMS test results and is estimated to be about 11,000 years old. There are 3 U5b1c* test results all of which have ancestry in Italy. The named subclades are U5b1c1 and U5b1c2. There are five U5b1c1 test results with ancestry in Italy, Spain, Scotland and the UK. U5b1c2 has 15 FMS test results, four are located in Ireland or UK, and one each in Spain, Poland and Croatia. It seems likely that U5b1c originated in Italy some 11,000 years ago and later expanded to other parts of Europe.
U5b1d is estimated to be about 12,000 years old and we have 15 FMS results. Those with known ancestry have been found in Italy, France, Ireland and Berber North Africa.
U5b1e is estimated to be about 8,000 years old and we have 14 FMS results mostly found in eastern Europe including Russia, Ukraine, and Slovakia. There is also one each found in Germany, Poland, England, Finland, Norway and the Czech Republic.
U5b1f has 4 FMS test results with 2 from Spain and one each from France and Germany. There are too few samples to predict the age with confidence, but an initial age estimate is between 4,000 and 8,000 years. If we predict membership in U5b1f based on HVR test results, this group appears to be found very frequently in Spain and among the Basque people. In a 2012 study of the Basque region by Behar et al., 12% of the Basques are in haplogroup U5b1f1a, while another 5% are in other subclades of U5.
U5b1g has 4 FGS test results and has only been found in Spain.
Haplogroup U5b2
(Last updated April 2013)
Haplogroup U5b2 has been estimated to be about 20,000 years old by Behar or about 22,000 years old by Soares, and it seems very likely that U5b2 was present in several different ice age refugia. U5b2 has 3 major subclades: U5b2a has 254 FMS test results and an age estimate of about 15,000 years. U5b2b is considerably smaller with 118 FMS test results and an age estimate of about 15,000 years. U5b2c has only 59 FMS test results and an age estimate of about 10,000 years. Finally, we have 5 U5b2* FMS samples including 4 U5b2*d samples from England, and one U5b2*e sample from India that might represent a branch of U5b2 that migrated to south Asia during the ice age. It would be very interesting to see more U5b2 test results from south Asia.
U5b2a1 has an age estimate of about 14,000 ybp and is widespread in Europe, including Russia. U5b2a1a has 51 FMS test results. It is estimated to be about 11,000 years old and is found throughout central and northern Europe. It also has a large number of samples and several named subclades. U5b2a1b has 12 FMS test results and an age estimate of about 3000 years, with samples found in Germany, England, Ireland, Poland, Czech and Russia. It seems likely that this subclade might have been present in an early Germanic tribe that subsequently spread into countries with some Germanic ancestry.
U5b2a2 (11,000 ybp) has 38 samples and is more frequent in central Europe (5 Germany, 4 Poland, 3 UK, 2 Netherlands, and 1 each Italy, Czech, Finland, Belarus, and 20 unspecified), and the lack of U5b2a2 in Russia might suggest an ice age refuge for U5b2a in Italy. U5b2a2 has a much younger age estimate, about 12,000 ybp, so this does suggest some uncertainty in the age of U5b2a, but that age estimate is dominated by 2 large subclades, and 3 U5b2a2* FMS results suggest an age of 19,000 years, so an age estimate of about 16,000 years seems reasonable for both U5b2a1 and U5b2a2.
U5b2a3 (11,000 ybp) has only 4 FMS test results with ancestry in Ireland, the UK and Germany.
U5b2a4 has 5 FMS samples but only 2 with known ancestry, in England and Norway.
U5b2a5 [updated Mar 2023] is defined by mutations at markers 8706, 10654, 11725 and 16311, and it has an estimated age of about 9000 years. There are 18 samples in three subclades: U5b2a5a is defined by a mutation at marker 3394 and has 6 members from Finland with an estimated age of about 4000 years. U5b2a5*B has extra mutations at markers 9055 and 15940 and has four members from Denmark, Sweden, Bulgaria and Russia, and its subgroup U5b2a5*B1 has an extra mutation at marker 2581 and has seven members but only one with known ancestry, from England. U5b2a5*C is defined by extra mutations at markers 5123, 13194 and 16223 and has 5 members with ancestry in England and Ireland.
U5b2a6 has 5 FMS samples but none have known ancestry.
There is also one U5b2a* FMS result with ancestry in Spain. We have not found several old branches of U5b2a* in Spain (as is the case for U5b1), so an ice age refuge for U5b2a in Iberia seems less likely, while an ice age refuge in the Franco-Cantabrian region or Italy seems more probable. My guess is that U5b2a expanded from an ice age refuge into northern Europe at an early date and was largely replaced in southern Europe.
U5b2b has 59 FMS test results with 4 named subclades (U5b2b1 to U5b2b4) and it also has 4 distinct U5b2b* lineages that are not yet named. U5b2b has an age estimate of about 15,000 years and its present distribution seems shifted more to the west compared to U5b2a. There were only 2 U5b2b tests in the 2010 Malyarchuk et al. study (1 Russia and 1 Slovak), and we have many more U5b2b project members in western Europe. The four U5b2b* have ancestry in the Netherlands, Germany, UK & Sardinia, and Scotland & Ireland. Italy or the Franco-Cantabrian region seem like possible ice age refuge origins for U5b2b.
U5b2b1 has an age estimate of about 10,000 years and 14 FMS test results with 4 from the UK, and one each from Germany, Poland, Russia and Slovakia.
U5b2b2 has an age estimate of 12,000 years based on 4 FMS samples with 1 from England and 1 from France.
U5b2b3 has an age estimate of about 9,000 years with 13 FMS samples (including a U5b2b2* from France, U5b2b2a* from Spain and Portugal, and U5b2b2a1 from Ireland, the UK, Denmark and Germany).
U5b2b4 has an age estimate of 5200 years based on 20 FMS samples with 6 from England, 3 from Germany, and 1 each from the Netherlands, Norway, Sweden, Poland and Switzerland.
U5b2c has an age estimate of about 15,000 years based on 20 FMS samples. It has been found exclusively in western Europe. There is one U5b2c* person with ancestry in Ireland.
U5b2c1 has 6 FMS samples including 2 from Spain, and one each from Ireland, England and Germany. One of the Spanish samples is from ancient human remains. Sanchez-Quinto et al. reported a FMS test result for the 7,000 year old remains of a Mesolithic hunter-gatherer at the La Brana-Arintero site which they identified as U5b2c1. Behar et al. estimated U5b2c1 to be about 4000 years old, although with large uncertainty in the date, while my age estimate for U5b2c1 based on the six modern FMS samples is 5,700 years. The La Brana-Arintero sample is at the upper end of the Behar uncertainty range and this raises the question of whether haplogroup ages might be older than estimated by Behar et al., and perhaps the slightly older estimates by Soares et al. might be more accurate. But it is not possible to reach conclusions from a single ancient DNA sample. The presence of U5b2c1 in Ireland and northwest Spain might be indicative of early population exchange between those areas.
U5b2c2 has an age estimate of 4800 years based on 20 FMS samples. This group includes 4 people from Ireland, 2 from Scotland and one each from England and Sweden. It seems likely that U5b2c had its origins in an Iberian or Franco-Cantabrian ice age refuge and arrived in the British Isles at a very early date, based on its frequency and diversity in Ireland.
Haplogroup U5b3
(Last updated April 2013)
Haplogroup U5b3 is relatively rare compared to its sister clades U5b1 and U5b2. U5b3 has been estimated by Behar et al. to be about 11,000 years old. We have 73 FMS test results for U5b3; however, most of these are from research studies specifically designed to study the population of Sardinia. We have a much smaller number of U5b3 test results in the U5 project. Although U5b3 is quite rare, it also has great diversity. We have 7 named subclades of U5b3 (U5b3a to U5b3g, but several of these have only 2 or 3 members), and we also have nine U5b3* lineages that are each represented by a single individual. They have ancestry in Spain, France, Germany, northern Italy, Croatia, Bosnia and Czech. One of the key research papers on U5b3 is by Pala et al., “Mitochondrial haplogroup U5b3: a distant echo of the epipaleolithic in Italy and the legacy of the early Sardinians”, and they conclude that "the most likely homeland for U5b3 was the Italian Peninsula". The current distribution of U5b3* test results could be consistent with an origin in northern Italy or central Europe and a relatively late arrival in Sardinia, perhaps arriving with bronze-age metal workers.
U5b3a has 28 samples and an age estimate of about 10,000 years. U5b3a1a has an age estimate of 2500 years and has 17 samples (all from research studies) with 17 from the island of Sardinia, 1 from Italy and one unspecified. There are 2 samples of U5b3a1b also from research studies, one of which was from France. U5b3a2 has 9 samples and is more widely distributed and with an older age estimate of about 6500 years. It has been found in central Italy, France, Greece, Estonia, England and Morocco. Five of these test results are also from the Pala et al. research study.
U5b3b has 13 FMS samples and an age estimate of about 4800 years. It has 5 members in Group 1 with ancestry in Germany and England who share a mutation at 16526. There are also 5 members in Group 2 with ancestry in Spain, France, Greece and Czech. There are also three U5b3b* members with ancestry in central Italy, Scotland, and Norway. It's interesting that U5b3b is found so widely distributed in Europe given its relatively young age estimate.
U5b3c has only 3 samples, all from the Pala et al. study, with ancestry in Sardinia, southern Italy and Spain, and with an age estimate of about 3000 years.
U5b3d has only 2 samples, also from the Pala et al. study, with ancestry in southern Spain and Iraq.
U5b3e has 7 samples (including 4 from the Pala et al. study), 2 each with ancestry in England and the Netherlands, and the others from Germany, Czech and Bulgaria. U5b3e has an age estimate of about 4000 years.
U5b3f also has 7 samples (including 5 from the Pala et al. study), with 3 from central Italy and 1 from Spain and an age estimate of about 1000 years.
U5b3g has 4 samples, with one from southern Italy and the other three of unknown ancestry.
Case Studies of U5 Diversity in Finnish and Basque Populations
The Saami and the Basque are both intensively studied populations, in part because they both speak non-Indo-European languages, and this has led to the theory that they could represent the descendants of Mesolithic Europeans. Haplogroup U5 is also found at high percentages in both populations, with approximately 50% of Saami, 22% of Finns and 18% of Basques identified as U5. These percentages are significantly greater than in other European populations, which typically range from about 4% to 12% U5 with a European average of 9% U5 (based on the data from Richard et al., 2007, An mtDNA perspective of French genetic variation).
As of November 1, 2012, we have 113 FMS test results from Finland (including the samples from the 1000 Genomes Project). It is interesting that there is very little diversity in the Finnish U5 distribution, especially when considering U5b. Nearly 40% of the Finnish U5 are in U5b1b1a (the "Saami motif"). This subclade is also found in eastern Europe and might have arrived in Finland by an eastern European route (perhaps along with haplogroup V, as suggested by Tambets et al., 2004). Note that in their paper Tambets et al. refer to the Saami motif with 16144 as "U5b1b1".
23% of the Finnish U5 samples are in U5b1b2. This group seems to have a more western origin with 3 samples from Ireland or England and one each from Germany, Norway and Sweden. It might have arrived in Finland by a more westerly route. Perhaps some of the diversity in U5b was lost by bottlenecks and drift, although perhaps migration and replacement by eastern European and Asian mtDNA haplogroups are partly responsible for the lack of diversity in U5b in Finland. It is also interesting that 13% of the Finnish U5 are in U5a2a1. This group is found at low frequency throughout Europe and is most often found in northern and eastern Europe, from Germany to Russia and Scandinavia. U5a2a has also been found in ancient remains in Germany dating to 8700 ybp, and ancient remains in Denmark dating to about 4000 ybp.
Table 1. Diversity in Finnish U5 full genome mtDNA samples.
N Subclade Percent
45 U5b1b1a 40%
26 U5b1b2 23%
11 U5b2a 10%
3 U5a1a1 3%
8 U5a1b 7%
15 U5a2a1 13%
3 U5a2b 3%
2 U5a2* 2%
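The percentages in Table 1 follow directly from the sample counts. As a minimal sketch (the variable and function names are illustrative, not part of the project's tooling), the shares can be recomputed from the counts, which also confirms that the rows add up to the 113 Finnish FMS results mentioned earlier:

```python
# Counts copied from Table 1 (Finnish U5 full-sequence samples).
finnish_u5_counts = {
    "U5b1b1a": 45,
    "U5b1b2": 26,
    "U5b2a": 11,
    "U5a1a1": 3,
    "U5a1b": 8,
    "U5a2a1": 15,
    "U5a2b": 3,
    "U5a2*": 2,
}


def subclade_shares(counts):
    """Return {subclade: share of the total}, rounded to a whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


total = sum(finnish_u5_counts.values())
shares = subclade_shares(finnish_u5_counts)

for name, pct in shares.items():
    print(f"{finnish_u5_counts[name]:>3} {name:<8} {pct}%")
print("total samples:", total)  # total samples: 113
```

Because each share is rounded independently, the rounded percentages sum to 101% rather than exactly 100%.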
These results suggest that the majority of Finnish U5 are represented by subclades that have relatively young age estimates, and the lack of diversity in older U5b subclades does not indicate a Mesolithic origin. However, it is possible that older and more diverse U5 subclades have been lost as a result of population bottlenecks and genetic drift.
A similar process seems to have occurred in the Basque population, with a large percentage of the Basque U5 mtDNA falling into a relatively young subclade U5b1f1a. In a 2012 study of the Basque by Behar et al., haplogroup U5 represented approximately 18% of the Basque population. However, about two-thirds of these were in U5b1f1a. (While Behar et al. did not compile statistics on haplogroup U5, this estimate is based on HVR data included in the supplement to the paper.) After sorting the samples by language, the highest percentage of U5b1f1a was found in the Basque and French speaking populations in the Basque region.
Table 2. Summary of U5b1f1a by language (based on data from Behar et al., 2012b)
Basque 75/599 = 12.5%
French 20/164 = 12.2%
Castilian 4/124 = 3.2%
In contrast, in the nearby region of Asturias, only one U5b1f1a was observed, 0.2% of the total sample size. When excluding U5b1f1a, the remaining U5 samples were very similar in the Basque and Asturian populations, with 5.6% of the population in U5 (excluding U5b1f1a) and with a similar distribution of U5 subclades in the two populations.
A 2013 study by Cardoso et al. performed full genome sequencing of ten of the Basque U5b1f1a samples, and this group has an age estimate of about 3000 ybp. This suggests that the high percentage of U5b1f1a (and thus U5) in the Basque is a result of a relatively recent U5b1f1a founder effect and population drift. Note that this conclusion is in contrast to that of Cardoso et al., who conclude that the presence of U5b1f indicates a pre-Neolithic origin for the Basque. However, this conclusion is not supported by data in the U5 project which shows the greatest diversity of U5b1f in Germany and very little diversity in the Basque U5b1f1a.
Haplogroup U5 population expansions and bottlenecks
The age estimates for U5 and its subclades are still uncertain; for example, age estimates for U5a and U5b vary from 22,000 years (Behar et al.) to 27,000 years (Soares et al.), and each of those estimates has an uncertainty range of several thousand years. While we cannot yet be certain of their exact age, we can compare those estimates to important events that would have impacted U5 population growth and attempt to determine if periods of slow or rapid expansion of U5 subclades are consistent with those events. Key events include:
(1) the warm period from about 70,000 to 30,000 years ago, followed by a return to colder conditions;
(2) the Last Glacial Maximum (LGM) about 22,000-17,000 years ago;
(3) the shift to a warmer and moister inter-glacial period around 14,500 years ago;
(4) the Younger Dryas cold period from about 12,800 to 11,500 years ago, followed by a rapid return to warm conditions in Europe;
(5) the adoption of agriculture in the Near East during the Neolithic Revolution beginning about 12,000 years ago, and the expansion of farming populations from the Near East, beginning during the Neolithic about 8,500 years ago in southeastern Europe, but not fully expanding into northern Europe until about 5,000 years ago. (See link above and text below for a more complete description of the glacial period).
(6) the Bronze Age migration from the Steppe region in eastern Europe and west Asia that brought Indo-European languages to western Europe. Certain subclades of U5, including U5a1a1, have been found in ancient remains of the Yamnaya Steppe people, and the high frequency of U5a1a1 in Europe today is likely a result of the Bronze Age migrations from the Steppe region.
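One way to make this comparison concrete is to check which climatic periods overlap a haplogroup's age range. The sketch below is only illustrative: it uses the three events in the list that have both a start and an end date, treats those approximate dates as hard boundaries, and uses hypothetical names throughout.

```python
# Intervals in years before present, taken from the numbered list of key
# events. Events without both a start and an end date are left out.
CLIMATE_PERIODS = [
    ("warm period (70,000-30,000 ybp)", 70_000, 30_000),
    ("Last Glacial Maximum", 22_000, 17_000),
    ("Younger Dryas", 12_800, 11_500),
]


def periods_overlapping(oldest, youngest):
    """Return the names of periods that overlap the age range (ybp, inclusive)."""
    if oldest < youngest:
        oldest, youngest = youngest, oldest
    return [
        name
        for name, start, end in CLIMATE_PERIODS
        if youngest <= start and oldest >= end
    ]


# The 22,000 to 27,000 year range quoted for U5a and U5b touches the
# start of the Last Glacial Maximum.
print(periods_overlapping(27_000, 22_000))  # ['Last Glacial Maximum']
```

Since the published age estimates carry uncertainties of several thousand years, an overlap found this way is a prompt for closer inspection, not evidence of a causal link.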
We assume that U5 originated in Europe before the Last Glacial Maximum because of the estimated age of U5 of about 30,000 to 35,000 years, and because U5 has been found in ancient remains at the Dolni Vestonice burial site in the Czech Republic that has been dated to 31,155 years ago, and because the highest frequency and greatest diversity of U5 is found among present day people with European ancestry.
There are five additional mutations that distinguish U5 from U, with no other branch points or sister clades, so we can assume that U5 experienced a long period of very slow population growth or a population bottleneck in Europe. One possibility is that haplogroup U arrived in Europe with the first modern humans some 47,000 years ago, and U5 originated in Europe, accumulating its five extra mutations over a 15,000 year period in Europe. Other migration histories are possible, and ultimately we will need ancient DNA test results to know the actual migration history. Currently, the earliest evidence of U5 in Europe is mtDNA from the ancient remains at Dolni Vestonice dated to about 31,000 years ago. The population bottleneck in U5 (i.e., the lack of sister clades from before 30,000 years ago) seems consistent with a population collapse that would have occurred during the LGM as U5 populations retreated into ice age refugia in southern Europe. Possible ice age refugia for U5 include the Iberian Peninsula, the Franco-Cantabrian Region, the Italian Peninsula, Greece and the Balkans, Anatolia, and Ukraine. If the U5a and U5b age estimates of 27,000 years are accurate, both would have been present in pre-Glacial populations in Europe and could also have been present in multiple refugia. The fact that U5a and U5b have only two and three surviving lineages, respectively, suggests that they lived at a time of very slow population growth, and this seems consistent with the possibility of U5a1, U5a2, U5b1, U5b2 and U5b3 each having origins in southern Europe during the glacial period. Next, we see very rapid population growth of each of these subclades, as each has eight or more known surviving lineages. This seems consistent with these five groups and their subclades experiencing rapid population growth as they expanded into central and northern Europe as the ice retreated approximately 14,000 years ago.
And we also have strong evidence based on mtDNA testing of the remains of ancient hunter-gatherers that U5 and its subclades (along with its sister groups U4, K and U2) were the dominant population groups in Europe during this period.
The next stage in the story of U5 begins with the Neolithic and the adoption of agriculture in Europe. There has been vigorous and controversial debate over the ancestry of the first European farmers. The theory of Demic Diffusion holds that early farmers migrated from the Near East into Europe and largely replaced the previous Mesolithic European populations. The theory of Cultural Diffusion holds that existing Mesolithic Europeans adopted agricultural technologies by a process of diffusion of cultural practices with limited migration of farmers from the Near East into Europe. Of course, some genetic mixing is possible with both theories, but in demic diffusion existing populations are mostly replaced, while in cultural diffusion there is limited replacement. While the debate continues, genetic evidence seems to support the theory that Mesolithic Europeans were mostly replaced by multiple waves of migration from the Near East and West Asia. Cultural diffusionists believe that y-DNA haplogroup R1b and mtDNA haplogroup H represent the original Paleolithic populations of Europe. Migrationists believe that R1b and H represent Neolithic and Bronze Age immigrants who mostly replaced earlier European populations. DNA testing of ancient remains shows that most Mesolithic hunter-gatherers in Europe were mtDNA haplogroups U5 and U4. Other studies have also shown a lack of genetic continuity from European Neolithic farmers to present day Europeans, and this suggests that there may have been multiple waves of migration and population replacement. However, because of the limited number of ancient DNA samples and challenges in accurately reading ancient DNA, there continues to be vigorous debate of these theories.
In a 2012 study by Fu et al., Complete Mitochondrial Genomes Reveal Neolithic Expansion into Europe, the authors analyze the dates at which haplogroups U and H underwent population expansions and contractions, and found "a population expansion between 15,000 and 10,000 years before present (YBP) in mtDNA typical for hunters and gatherers, with a decline between 10,000 and 5,000 YBP. These corresponded to an analogous population increase approximately 9,000 YBP for mtDNA typical of early farmers. The observed changes over time suggest that the spread of agriculture in Europe involved the expansion of farming populations into Europe followed by the eventual assimilation of resident hunter-gatherers."
The summary of the U5 subclades presented here is consistent with the conclusions of Fu et al. We see very rapid population expansion beginning with subclades of U5a1, U5a2, U5b1, U5b2 and U5b3 around 15,000 years ago. But what is most interesting is that certain of these subclades have very different patterns of population expansion than others. We have a large number of U5 subclade lineages that date to around 15,000 years ago that are represented by a single sample in the U5 project. In contrast, we have a few relatively young subclades that represent a large percentage of the U5 test results. For example, U5a1a1 with an age of 7,000 years has 94 members that represent 31% of all U5a1; U5a2a1 with an age of 6,000 years has 63 members that represent 37% of all U5a2; and U5b1b1a with an age estimate of 4,000 years has 66 members that represent 52% of all U5b1. (Also, U5b1f represents 65% of all U5 samples in another 2012 study by Behar et al. of the Basque people; however, U5b1f has not yet been reliably dated and could be older than the other large subclades discussed here.) Thus, a large number of U5 test results represent young subclades that began very rapid population expansion during the Neolithic. Several interesting questions remain: were these young, large U5 subclades part of Mesolithic populations who were adopted into groups of Neolithic immigrant communities as they began to expand into southeastern Europe? Were the older and more rare U5 subclades part of Mesolithic communities on the fringe of western or northern Europe who adopted Neolithic technologies at a much later date? Or are these differences the result of population drift or selection? Can the analysis of the 2012 Fu et al. study be repeated with exclusion of U5a1a1, U5a2a1 and U5b1b1a to see if this reveals a stronger signal of population differences between Mesolithic U5 and Neolithic immigrants? To what extent does oversampling of certain populations affect these results? For example, it seems likely that we have a much higher sample frequency from people of northwest European ancestry compared to other parts of Europe, Asia and Africa.
The Last Glacial Maximum
I adapted the following description of the LGM and Younger Dryas from the Oak Ridge National Laboratory's "A quick background to the last ice age":
After about 30,000 years ago, the Earth's climate system entered another big freeze-up; temperatures fell, deserts expanded and ice sheets spread across the northern latitudes. This cold and arid phase which reached its most extreme point sometime around 21,000-17,000 years ago is known as the Late Glacial Cold Stage. The point at which the global ice extent was at its greatest, about 21,000 years ago is known as the Last Glacial Maximum. The Last Glacial Maximum was much more arid than present almost everywhere, with desert and semi-desert occupying huge areas of the continents and forests shrunk back into refugia. But in fact, the greatest global aridity (rather than ice extent) may have been reached slightly after the Last Glacial Maximum, somewhere during the interval 19,000-17,000 years ago.
Warming, then a cold snap: Around 14,000 years ago, there was a rapid global warming and moistening of climates, perhaps occurring within the space of only a few years or decades. Conditions in many mid-latitude areas appear to have been about as warm as they are today, although many other areas - whilst warmer than during the Late Glacial Cold Stage - seem to have remained slightly cooler than at present. Forests began to spread back, and the ice sheets began to retreat. However, after a few thousand years of recovery, the Earth was suddenly plunged back into a new and very short-lived ice age known as the Younger Dryas. Although the Younger Dryas did not affect everywhere in the world, it destroyed the returning forests in the north and led to a brief resurgence of the ice sheets. The main cooling event that marks the beginning of the Younger Dryas seems to have occurred within less than 100 years, according to Greenland ice core data. After about 1,300 years of cold and aridity, the Younger Dryas seems to have ended in the space of only a few decades when conditions became as warm as they are today.
List of Ancient mtDNA U5 Samples
Updated 1/1/2023
Subclade, age (approximate years before present), Lead Author, GenBank ID other ID, Culture/Location: list of extra mutations
U5, 31,155 ybp, Fu, Dolni Vestonice 15, Czech: no extras
U5, 31,155 ybp, Posth, Dolni Vestonice 14, Czech: no extras
U5, 29,977 ybp, Posth, DolniVestonice43, Czech: T16231C, C16519T
U5, 29,977 ybp, Posth, DolniVestonice16, Czech: G1462A, C16519T
U5, 26,662 ybp, Posth, Goyet2878-21: T3202C, C3612T, C13272T, A13299G, T16192C!, C16519T
U5a1*, 5600-4850 ybp, Lipson, ID?, Baden_LCA/Hungary: G13889A
U5a1*, 8550 ybp, Loosdrecht, UZZ82 Mesolithic II Castelnovian/Sicily: T1007C, 3865G, 9380A
U5a1*, 7200 ybp, Haak, DEB36, Linear Pottery Culture/Germany: T1007C, 150, 16093
U5a1*L, 7700 ybp, Haak, Motala 1, Sweden: G5460A
U5a1*L, Roman era, Emery, MG773617: G5460A, T195C, G6267A, A13651G, G5237A, T16093C
U5a1a1, 5100 ybp, Haak, SVP50, Samara, Russia/Yamnaya: basal, no extras
U5a1a1, 5100 ybp, Haak, SVP52, Samara, Russia/Yamnaya: basal, no extras
U5a1a1h, 3940 ybp, Nikitin, West Pontic-Caspian: 16192, 16296
U5a1c, 5705 ybp, Lipson, GEN63, Protoboleraz_LCA/Hungary: no extras
U5a1c, 7500 ybp, Marchii, Asp6, Early Neolithic LBK, Austria: no extras
U5a1c*4a, Iron Age, Chylenski, KX977313, Scythian/location?: T152C, A6752G, T7080C, A14274G, A16180G, C6T, C3571T
U5a1c1a, 4100 ybp, Günther, ATP20, El Portalón cave, Spain: G7013A, A11914G, C16519T
U5a1c2, 2900 ybp, Olalde, I3130, Scotland: T7080C, T11770C, G9452A, C11288T, C16355T, T16359C
U5a1d, 7600 ybp, Haak, SVP44, Samara, Russia: 16241C
U5a1d2, 1560 ybp, Dryomov, Kurgan 2, Sayan Mountain, Minusinskaya, Tepsei III: 4215, 14053
U5a1d2b, 2300 ybp, Pilipenko, Scytho-Siberian Pazyryk, Mongolia: T16086C, T16189C
U5a1d2b, 2300 ybp, Pilipenko, Scytho-Siberian Pazyryk, Altai: no extras
U5a1i1a, 3800 ybp, Haak, ESP3, Unetice Culture, Germany: no extras
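The entries in this list follow a loose "subclade, age, details: extra mutations" pattern, so they can be read into a simple record for sorting or filtering. The following is a rough sketch only: the `AncientSample` class and `parse_sample` function are hypothetical helpers, not part of any project software, and the author, sample ID and location are kept as one unparsed string because those fields are not consistently delimited.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Matches "31,155 ybp", "7600 ybp" or a range such as "5600-4850 ybp".
AGE_RE = re.compile(r"(\d[\d,]*)(?:-(\d[\d,]*))?\s*ybp")


@dataclass
class AncientSample:
    subclade: str
    oldest_ybp: Optional[int]    # None when the age is an era, e.g. "Iron Age"
    youngest_ybp: Optional[int]
    description: str             # author, sample ID and location, unparsed
    extra_mutations: List[str] = field(default_factory=list)


def _to_int(text):
    return int(text.replace(",", ""))


def parse_sample(line):
    """Parse one list entry into an AncientSample."""
    head, _, tail = line.partition(": ")
    subclade, _, rest = head.partition(", ")
    oldest = youngest = None
    match = AGE_RE.search(rest)
    if match:
        oldest = _to_int(match.group(1))
        youngest = _to_int(match.group(2)) if match.group(2) else oldest
        rest = rest[match.end():].lstrip(", ")
    mutations = [
        m.strip()
        for m in tail.split(",")
        if m.strip() and m.strip() not in ("no extras", "basal")
    ]
    return AncientSample(subclade, oldest, youngest, rest, mutations)


sample = parse_sample(
    "U5, 29,977 ybp, Posth, DolniVestonice43, Czech: T16231C, C16519T"
)
print(sample.subclade, sample.oldest_ybp, sample.extra_mutations)
# U5 29977 ['T16231C', 'C16519T']
```

Entries dated only by era (for example "Roman era") come back with both age fields set to `None`, so they can be handled separately rather than being assigned an invented date.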
Updates on the Project
April-June 2011
The three U5 projects are reorganized to collaborate. All U5 members will be part of the main U5 project, and people who tested the full mtDNA sequence can also join the U5a or U5b FMS projects.
Earlier Project Updates
4 Sept 2010
We have found 3 new daughter groups of U5a2 among the U5 project members. Here are the current results of all U5a2 including both the U5 project and all full mtDNA sequences published in GenBank:
U5a2a has 44 members and is widely distributed throughout Europe
U5a2b has 37 members with a dominant daughter of 28 members that is mostly located in central/eastern Europe
U5a2c has 12 members and is located mostly in western Europe
U5a2d has 2 members with one of them in Finland
U5a2e has 3 members in eastern Europe (Slovenia, Belarus, Czech)
The 3 new unnamed daughter groups of U5a2 have origins in France, Portugal and England
11 Aug 2010
We currently have 840 U5 members in the project including 441 in U5a, 397 in U5b, and 2 not assigned to U5a or U5b. U5a is mostly U5a1 with 312 members and there are 129 members in U5a2. U5b subgroups are more difficult to predict, but there are currently 70 members in U5b1, 90 members in U5b2, and 24 members in U5b3. Another 56 people are in the U5a project including 47 U5a and 9 U5b. Combining both projects we have 896 U5 test results. Approximately 200 U5 people have done the full mtDNA sequence test (FGS or FMS).
15 July 2010
Jørgen, Gail and Andreas have joined as co-administrators for the U5 project to organize the results and update the trees for U5a1, U5a2 and U5b, respectively. All project members have been placed in groups based on results of the Full mtDNA sequence or predicted groups based on HVR1 and HVR2 results.