Saturday, April 21, 2012

So who's the most European of us all?

Basically, the first map below reveals the answer. It shows the spread of a European-specific cluster, which I call "North European", from a global ADMIXTURE analysis at K=8 (eight ancestral populations assumed). Thus, genetically, the most European populations are found around the Baltic Sea, and in particular in the East Baltic region. In my genome collection, samples from Lithuania clearly and consistently score the highest percentages in ADMIXTURE clusters specific to Europe. However, I suspect that if I had Latvians with no known foreign ancestry going back more than four generations, they'd come out the "most European". Hopefully we can test that in the near future.

Below are the fifteen Eurogenes samples that scored the highest percentage levels of membership in the North European cluster. The list only includes groups with five or more individuals present in the analysis, so some populations, like Estonians or Danes, weren't included, even though they easily made the cut. The spreadsheet with all the results from this run can be seen here. A table of Fst (genetic) distances between the eight clusters is available here.

Lithuanians 77%
Finns 74%
Belorussians 70%
Swedes 69%
Norwegians 68%
Kargopol Russians 68%
Russians 68%
Poles 68%
Erzya 66%
Ukrainians 66%
Moksha 66%
Orcadians 63%
HapMap Utah Americans (CEU) 63%
Irish 63%
British 62%
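For anyone who wants to reproduce this kind of ranking from the spreadsheet, here's a minimal sketch in Python. It assumes each group's score is the plain average of its members' North European percentages, which is my reading of the list above rather than something the spreadsheet spells out. The function name and the sample data are hypothetical; only the five-individual cutoff and the top fifteen come from the post.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # cutoff stated in the post: five or more individuals


def rank_groups(samples, min_size=MIN_GROUP_SIZE, top_n=15):
    """Rank groups by mean membership in one ADMIXTURE cluster.

    samples: iterable of (group_name, fraction) pairs, one per individual,
             where fraction is that person's membership (0..1) in the cluster.
    Groups with fewer than min_size individuals are dropped.
    Assumption: the group score is the simple mean of its members.
    """
    by_group = defaultdict(list)
    for group, fraction in samples:
        by_group[group].append(fraction)

    means = {
        group: sum(values) / len(values)
        for group, values in by_group.items()
        if len(values) >= min_size
    }
    # Highest mean first, keep only the top_n groups.
    return sorted(means.items(), key=lambda item: item[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical individuals, not real Eurogenes data.
    demo = [("GroupA", 0.80)] * 5 + [("GroupB", 0.90)] * 4 + [("GroupC", 0.60)] * 6
    for name, mean in rank_groups(demo):
        print(f"{name} {mean:.0%}")
```

Note that GroupB has the highest individual scores in the demo but is left out because it only has four members, which is the same reason Estonians and Danes are missing from the list above.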

So why did I pick the results from K=8, and not some other K, like 2, 10, or 25? Well, it's not possible to evaluate who is more European without a European-specific cluster (i.e. modal in Europeans, with a low frequency outside of Europe). Provided that a decent number and range of global and West Eurasian samples are used in the analysis, such clusters begin appearing at around K=5 or K=6, and start breaking up into local clusters from about K=9. I found that runs below K=8 produced European clusters that spilled too generously outside of the borders of Europe. On the other hand, runs above K=8 produced European clusters that weren't representative of enough European groups (i.e. too localized). But the European cluster from K=8 was pretty much perfect, and I think that's obvious from the map. In fact, I can hardly believe how well it fits the modern geographic concept of Europe - north of the Mediterranean and west of the Urals. Amazing stuff.
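The two failure modes described above (too much spill at low K, too localized at high K) can be put into rough numbers. The sketch below is only an illustration of that reasoning, not the procedure I actually used, which was mostly eyeballing maps. The function name, the 25% "present" threshold and the example values are all hypothetical.

```python
def cluster_profile(group_means, european_groups, present=0.25):
    """Summarize how European-specific one cluster is at a given K.

    group_means:     {group_name: mean membership (0..1) in the cluster}
    european_groups: set of group names treated as European
    present:         hypothetical cutoff for the cluster counting as
                     well represented in a group

    Returns:
      spill           - highest mean membership in any non-European group
      coverage        - share of European groups at or above `present`
      modal_in_europe - True if the cluster peaks in a European group
    """
    inside = [v for g, v in group_means.items() if g in european_groups]
    outside = [v for g, v in group_means.items() if g not in european_groups]
    if not inside:
        raise ValueError("no European groups found in group_means")

    spill = max(outside, default=0.0)
    coverage = sum(v >= present for v in inside) / len(inside)
    return {
        "spill": spill,
        "coverage": coverage,
        "modal_in_europe": max(inside) > spill,
    }


if __name__ == "__main__":
    # Hypothetical group means for one cluster, not real results.
    means = {"Eur1": 0.70, "Eur2": 0.50, "Eur3": 0.10, "Out1": 0.20, "Out2": 0.05}
    print(cluster_profile(means, {"Eur1", "Eur2", "Eur3"}))
```

Under this view, a good K is one where the European cluster has low spill and high coverage at the same time. Runs below K=8 would show higher spill, and runs above K=8 would show lower coverage.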

There are two other clusters that show up across Europe in non-trivial amounts - Mediterranean and Caucasus (see maps below). These can also be thought of as native European clusters, since they've been on the continent for thousands of years. However, their peak frequencies are found in West Asia, so they're not particularly useful signals of European-specific ancestry.

So what do these three clusters show exactly? They represent certain allele frequencies in modern populations, and in fact, these can change fairly rapidly due to admixture, selection, and genetic drift. So claiming that such clusters represent pure ancient populations is unlikely to be true in most cases, if ever. However, I don't think there's anything wrong in saying that, when robust enough, they can be thought of as signals of ancestry from relatively distinct ancestral groups.

Indeed, anyone who's read up on the prehistory of Europe knows that there are three general Neolithic archeological waves to consider when trying to untangle the story of the peopling of Europe. These are Mediterranean Neolithic, Anatolian Neolithic and Forest Neolithic (for example, see here).

Mediterranean Neolithic refers to a series of migrations from West Asia via the Mediterranean and its coasts. The areas most profoundly affected by these movements include the islands of Sardinia and Corsica, and the Southwest European mainland. Anatolian Neolithic describes migrations into Europe from modern-day Turkey, mostly into the Balkans, but also as far as Germany and France. At the moment, the Forest Neolithic of Northeastern Europe is something of a mystery. However, the general opinion is that it was largely the result of native Mesolithic hunter-gatherers adopting agriculture.

Obviously, it's very difficult to dismiss the correlations between these three broad archeological groups and the European and two European/West Asian clusters produced in my K=8 ADMIXTURE analysis. Is it a coincidence that the Mediterranean cluster today peaks in Sardinia, which has been largely shielded from foreign admixture since the Neolithic, and today forms a very distinct Southern European isolate? Why does the North European cluster show the highest peaks in classic Forest Neolithic territory? And why does the Caucasus cluster radiate in Europe from the southeast, which is where Anatolian farmers had the greatest impact? These can't all be coincidences, and I'm willing to bet that none of them are. I'm convinced that the three clusters from my K=8 run are strong signals from the Neolithic, and the North European cluster also from the Mesolithic.

Eventually, these issues will be settled with ancient DNA data, in a much more comprehensive way than is possible using modern genomes. We've already seen some preliminary results, mostly from Mesolithic, Neolithic and Bronze Age sites around Europe, so it's worth asking whether my ADMIXTURE analysis and commentary here mirror these early findings. I think they do. For instance, here's an interesting conclusion regarding the East Baltic area from a study on ancient Scandinavian mtDNA by Malmström et al.:

Through analysis of DNA extracted from ancient Scandinavian human remains, we show that people of the Pitted Ware culture were not the direct ancestors of modern Scandinavians (including the Saami people of northern Scandinavia) but are more closely related to contemporary populations of the eastern Baltic region. Our findings support hypotheses arising from archaeological analyses that propose a Neolithic or post-Neolithic population replacement in Scandinavia [7]. Furthermore, our data are consistent with the view that the eastern Baltic represents a genetic refugia for some of the European hunter-gatherer populations.

I suppose there will be people wondering why I didn't take Sub-Saharan African, East Asian, and South Asian admixtures into account in my analysis. The reason is that I wasn't looking at which group was most West Eurasian, or Caucasoid. Based on everything I've seen to date, in my own work as well as elsewhere, the most West Eurasian group would probably be the French Basques from the HGDP. However, the differences between them and certain groups from Northeastern Europe, like Northern Poles and Lithuanians, really wouldn't be that great anyway. I might do a write-up about that at some point.


- Maps by Eurogenes project member FR7

- Additional stats by Eurogenes project member DESEUK1


Helena Malmström et al., Ancient DNA Reveals Lack of Continuity between Neolithic Hunter-Gatherers and Contemporary Scandinavians, Current Biology, 24 September 2009, doi:10.1016/j.cub.2009.09.017

Noreen von Cramon-Taubadel and Ron Pinhasi, Craniometric data support a mosaic model of demic and cultural Neolithic diffusion to outlying regions of Europe, Proc. R. Soc. B published online 23 February 2011, doi: 10.1098/rspb.2010.2678


jackson_montgomery_devoni said...

What is the main difference between the North European hunter-gatherer (+ Neolithic admixture) component in this K=8 run compared to the Baltic hunter-gatherer component in the K=12 run? Which one do you think is closest to the original hunter-gatherer population of Northern Europe?

Davidski said...

^ It looks like basically the same component. But it seems the Mediterranean got sort of broken up into another Mediterranean and Southwest Asian (which looks like Mediterranean + something East African).

jackson_montgomery_devoni said...

Do you mean that the Mediterranean in the K=12 run is broken up into ''Mediterranean farmer'', ''Anatolian farmer'' and ''Middle Eastern herder'' components?

princenuadha said...

I didn't see any evidence that the north European component (peaking in NE Europe) should be considered a mix of h/g and neo.

The cranial study you link said that neos did not penetrate the area where "north European" peaks. The one connection you make between "north European" and neolithic farmers is by connecting the androv people to the funnel beaker... which is indirect and relies on a few assumptions that aren't a given.

Anyways, great post!

Brandon said...

What happened to Latvians? Did they get mixed? And what geographic features might have isolated this strain even more than Lithuania? I looked at a physical map. Perhaps that peninsular "bump" west of the Gulf of Riga would make a good refuge area for early proto-Nordids.

Davidski said...

^ Nothing happened to the Latvians, I just don't have any, apart from one who's 1/4 Russian. She's close to the top of the European list, but I need at least 5 people from each ethnic group to provide meaningful stats.

And I think the reason the East Baltic, and the East European Plain in general, was a refuge for proto-Europeans, was due to the heavily forested terrain and fairly poor climate. The early farmers headed straight for Western Europe, because they couldn't do much near the Baltic. On the other hand, the later Asiatic steppe invaders headed straight for the Hungarian Plain via Southern Ukraine, because they couldn't feed their horses in the forests.

The only really foreign influence in this zone was Uralic, from the Volga-Ural, and that did bring some Siberian admix to the North and East parts of the East European Plain. That's because these people didn't mind the cold nor the forests.

maxillamaximus said...

I suspect Dienekes is not going to be very happy that you chose the K=8 analysis to represent "the most European of us all", as the Greeks don't fare too well on this plot. I however like it a lot. Great work! Thumbs up. :- )

Esker1970 said...

Hi David,

I am sample PT10. Can you include it on your K8 database (Maritime Neolithic vs. Northern European Hunter Gather + Neolithic Admixture)? Thank you very much.

Joao Bessa Santos

princenuadha said...


He is still our euro brother : )


How much of the "Mediterranean element" in Europe do you think is made up of mesolithic Europeans compared to maritime neolithic farmers?

Davidski said...

^ I have no way to estimate that at present. If I had to guess, it'd be something like 10-15%.

Davidski said...


I will, but I've just put together a very similar test which will be available at Gedmatch from tomorrow (or later today). So use that for the time being.

mindaugas said...

I actually don't believe that Latvians would have more European genes, for one simple reason: Latvians and Estonians were heavily colonised by Germans, later by Swedes, in part by the Lithuanian-Polish state, and over the past centuries by Russians. So I'd guess their genetics are much more mixed than those of Lithuanians, who were famous for their fierce defence against western and eastern colonisers. They were also the last pagans in Europe, which shows how closed their communities remained until the 14th century.