Wednesday, March 7, 2012

Southwest Eurasians + Northwest Eurasians + Mesolithic survivors = modern Europeans

Update 23/13/2013: Things become much clearer thanks to the latest batch of ancient genomes. But mysteries remain. See here.


For a long time, it was generally accepted that Europeans were direct descendants of Palaeolithic settlers of the continent, with some Middle Eastern ancestry in the Mediterranean regions, courtesy of Neolithic farmers. However, in the last few years, largely thanks to ancient DNA results, it dawned on most people that such a scenario was unrealistic. It now seems that Europe was populated after the Ice Age in a big way, by multiple waves of migrants from almost all directions, but especially from the southeast.

Getting to grips with the finer details of the peopling of Europe is going to be a difficult and painstaking process, and will require ancient DNA technology that probably isn’t even available at the moment. However, the mystery about the basic origins and genetic structure of Europeans was solved for me this week, after I completed a series of ADMIXTURE runs focusing on West Eurasia (see K=10, K=11, K=12, and K=13). The map below, produced by one of my project members, summarizes very nicely the most pertinent information from those runs (thanks FR7!). It shows the relative spread of three key genetic clusters, from the K=13 run, in a wide range of populations from Europe, North Africa, and West, Central and South Asia (i.e. the data represent the nature of West Eurasian alleles in the sampled groups, with only three clusters considered). The yellow cluster is best described as Mediterranean or Southwest Eurasian, while the cyan and magenta, which are sister clades, are best described as Northwest Eurasian.
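To make the "relative spread" idea concrete, here is a minimal sketch of how three components could be isolated from a larger set of ADMIXTURE-style proportions and rescaled so they sum to one. This is only an illustration under stated assumptions: the function name, the "Other" label, and all the numbers are hypothetical, and this is not necessarily how the map was actually produced.

```python
# Minimal sketch: rescale a chosen subset of ancestry components so that
# they sum to 1, ignoring all other components.
# All values below are made up for illustration; they are NOT results
# from the K=13 run.

def relative_shares(proportions, keep):
    """Return each kept component as a fraction of the kept total."""
    total = sum(proportions[name] for name in keep)
    if total == 0:
        # None of the kept components is present in this sample.
        return {name: 0.0 for name in keep}
    return {name: proportions[name] / total for name in keep}

# Hypothetical population averages across all components (sum to 1).
sample = {
    "Southwest Eurasian": 0.20,
    "North Atlantic": 0.30,
    "Baltic": 0.10,
    "Caucasus": 0.25,
    "Other": 0.15,
}

KEEP = ("Southwest Eurasian", "North Atlantic", "Baltic")
shares = relative_shares(sample, KEEP)

for name in KEEP:
    print(f"{name}: {shares[name]:.2f}")
```

The point of the rescaling is that a population with a small absolute amount of all three clusters can still be compared with one that has a lot, since only the balance between the three is shown.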

Thus, it appears as if modern Europeans are made up of two major Neolithic groups, which are related, but at some point became distinct enough to leave persistent signals of that split. They spread into different parts of Western Asia before moving into Europe. The Southwest Eurasians, possibly from the southern Levant, dominated the Mediterranean Basin, including North Africa, Southern Europe, and the Arabian Peninsula. I’m pretty sure that Otzi the Iceman is the best known representative of the ancient Southwest Eurasians (see here).

The Northwest Eurasians might have originated in the northern Levant, but that’s a pure guess. In fact, judging by the map above, their influence isn’t particularly strong in that part of the world today, and only becomes noticeable several hundred kilometers to the north and east, in the North Caucasus and Iran respectively. However, the northern Levant is actually dominated by a fourth West Eurasian cluster, tagged by me as "Caucasus" in the K=13 run, and not shown on the map above. Various calculations show that this can also be assigned to the Northwest Eurasian group, except that it seems to have split from the other Northwest Eurasian components at an early stage (see comments section below).

After their initial spread, it appears as if the Northwest Eurasians absorbed varying amounts of native Mesolithic groups in their newly acquired territories west, north and east of the Levant. This is strongly suggested by the aforementioned ancient DNA results, at least as far as Europe is concerned. They also mixed heavily with Southwest Eurasians in Europe and nearby. That’s why, for instance, you’ll never find an Irishman who clusters closer genetically to an Indian than to other Europeans. However, even a basic analysis of their DNA, like my own ADMIXTURE runs, shows that a large subset of their genomes comes from the same, relatively recent, “Northwest Eurasian” source.

We can follow the same logic when talking about the differentiation between modern descendants of Southwest Eurasians. For instance, those in Iberia have significant admixture from Northwest Eurasians, while those in North Africa carry appreciable amounts of West and East African influence.

I’m convinced that the scenario of the peopling of Europe outlined above, by two basic stocks of migrants from Neolithic West Asia, is the only plausible one, because the signals from the data are just too strong to argue against it. I’m sure you’ll be seeing the same story told by scientists over the next few years in peer reviewed papers. They’ll probably come up with different monikers for the Southwest and Northwest Eurasians, but the general concepts will be the same.

However, that was the easy part. The hard part is linking the myriad of movements of these Southwest and Northwest Eurasians with archaeological and linguistic groups. Perhaps the earliest Southwest Eurasians into Europe were Afro-Asiatic speakers? To be honest, I have no idea, because that’s not an area I’ve studied closely. But I would say that it’s almost certain that the proto-Indo-Europeans were of Northwest Eurasian stock. It’s an obvious conclusion, due to the trivial to non-existent amounts of Southwest Eurasian influence in regions associated with the early Indo-Europeans, like Eastern Europe and Central Asia.

Perhaps the simplest and most diplomatic thing to do for the time being would be to associate the entire Northwest Eurasian group with an early (Neolithic) spread of Indo-European languages from somewhere on the border between West Asia and Europe? I know that would work for a lot of people, specifically those who’d like to see an Indo-European Urheimat in Asia, as opposed to Europe. But it wouldn’t work for me, especially not after taking a closer look at that map above.

As already mentioned, the Northwest Eurasians can be reliably split into two clusters, marked on the map in cyan and magenta. I call the cyan cluster North Atlantic, because it peaks in the Irish and other Atlantic fringe groups, and the magenta Baltic, because it shows the highest frequencies in Lithuanians and nearby populations. The story suggested by the map is pretty awesome, with the Baltic cluster seemingly exploding from somewhere in the middle of the Northwest Eurasian range, and pushing its close relatives to the peripheries of that range. Thus, under such a dramatic model, the North Atlantic is essentially the remnant of the pre-Baltic Northwest Eurasians, and appears to have found refuge in Western and Northwestern Europe, in the valleys of the Caucasus Mountains, and in South Asia.

Indeed, there seems to be a correlation between the highest relative frequencies of the North Atlantic and regions that are still home to non-Indo-European speakers, or were known to have been home to such groups in historic times. For instance, France has the Basques, while the British Isles had the Picts, who are hypothesized to be of non-Indo-European stock. Note also the native, non-Indo-European speakers in the Caucasus, like the Chechens, who show extreme relative frequencies of the North Atlantic component. Moreover, at the south-eastern end of the Northwest Eurasian range, in India, there are still many groups of Dravidian speakers.

Below are two maps that isolate the relative frequencies of the North Atlantic (cyan) and Baltic (magenta) components, versus each other and the Southwest Eurasian cluster, to better show the hole in the distribution of the North Atlantic. To be sure, this North Atlantic can be broken down further, but only with a more comprehensive sampling strategy, especially of Northern and Western Europe.

That’s my take on what the data is showing, and other explanations are possible. But I don’t really know what they might be. I should also mention that the potentially proto-Indo-European Baltic cluster shows a remarkable correlation with the spread of Y-chromosome haplogroup R1a, and ancient DNA rich in this haplogroup from supposed early Indo-Europeans. For more info on that, see the links below:

Razib said...

who are hypothesized to be of non-Indo-European stock

re: the picts, this is a debated issue. some people assert that they were brythonic celts. the sample size of the pictish language in words is supposedly small (so you could have a model of celtic with a pre-celtic non-indo-european substrate).

Fanty said...

An interesting sidenote on what the Picts actually looked like:

Tacitus (Roman historian from the 1st century AD) suggests the Picts could originate in Germany, because they look exactly like Germans to him. He doesn't say anything about the language, however.

This is especially of interest as he also talks about British Isles tribes that look like Iberians, with black hair and darker skin tones.

Paul Ó Duḃṫaiġ said...

I agree with Razib. For every theory that hypothesizes that they are pre-IE, there is another one that points to them being either Brythonic or even Gaulish when it comes to the Celtic language family. The placenames and first names referenced are clearly P-Celtic in origin (non-Goidelic).

One of features of proto-Goidelic is the shift of "Proto-Celtic" V/W -> F
eg. Veni -> Féni (old Irish) -> Fianna (modern Irish)

A similar shift happened in Brythonic though to "GW". Thence:

Vindos (proto-Celtic) -> Find (old irish) -> Fionn (modern irish)
The cognate in welsh is "Gwen"

Neither of the above sound changes is found in the surviving Gaulish corpus. In the case of Pictish it could be due to isolation from the rest of Brythonic.

An example is the Pictish name Uurguist vs. Fergus (old irish) / Fearghus (modern Irish)

Eduardo Pinto said...

I've used this equation to estimate the time of divergence of the K=13 components and it actually makes your point stronger.

It seems that at one point the West Central Asian, the North Atlantic and Caucasus comprised one of the major Neolithic groups, most definitely your Northwest Eurasian group. And then around 11 ky BP it started splitting into those three groups, first the WCA, then the Caucasus, followed lastly by the NA.
As for the Baltic it only came to be around 4 ky BP.


M vs. WCA, SWA vs. CA, SWA vs. M, M vs. CA, M vs. NA, M vs. BA, WCA vs. CA, BA vs. CA, NA vs. CA, NA vs. BA

Davidski said...

^ Interesting stuff. Thanks.

Eochaidh said...

From a complete layman... amazingly cool!!

Thanks, David!

Unknown said...

Maybe I am missing something, but I don't see the maps above as evidence of any kind of explosion of a "Baltic cluster" that supposedly pushed the "Atlantic cluster" to the western peripheries of Europe. The cyan color occurs in the most densely populated parts of Europe, and, if the Baltic cluster is supposed to be represented by y haplogroup R1a, we know it made very little impression there.

I think you are reading way too much into a map of autosomal differences and making the wish the father of the thought.

Davidski said...

^ I'm not sure what your argument is?

I agree that the expansion of the people who spread the Baltic cluster, and probably R1a, did not push in a major way into Western and Southern Europe. And yes, this might be because this expansion happened after these areas attained high population densities.

But in my post I didn’t argue that the Baltic cluster + R1a pushed into Western Europe. I argued that this expansion PUSHED ASIDE its Northwest Eurasian relatives, who then survived in more peripheral areas of the Northwest Eurasian range. The map shows this very well, and I don’t see any better explanations for the data.

Interestingly, Western and Southern Europe were also peripheral regions to the early Indo-Europeans. It’s likely these regions were Indo-Europeanised by a domino effect, with little gene flow from the proto-Indo-European cradle.

Unknown said...

Pushed aside? "Peripheral areas"? Have you been to Eastern Europe and specifically to the steppe? I have. Those aren't exactly the choicest spots in Europe. Are you saying the "Baltic Cluster" pushed aside its Atlantic counterparts to grab Eastern Europe and the steppe, thereby relegating the Atlantic Cluster to Western Europe (i.e., arguably the better half of the continent)?

I don't see your maps as painting any such picture of anyone pushing anyone anywhere. They look like maps of autosomal differences as they currently exist.

Eduardo Pinto said...

David, you should run Oetzi against this K=13 run. I'm dying to know if he has any NA in him.

Davidski said...

^ I'm not sure what the problem is? The data seems to paint a very obvious picture.

A population carrying allele frequencies that created the Baltic cluster expanded from somewhere in Eastern Europe - most probably from near the South-eastern Baltic coast.

This expansion more or less cut in half the old Northwest Eurasian cluster, thereby relegating the North Atlantic allele frequencies to the peripheries of that former cluster.

Obviously, these people who expanded from somewhere in Eastern Europe didn't care too much that they weren't located in the choicest part of the real estate. The reason they expanded was probably because they had some technological advantages, like wheeled transportation and metal, both of which were present in Eastern Europe earlier than in Western Europe.

But the fact that, as you mention, Western Europe had nice soils and higher population densities, slowed these Eastern Europeans in the west, but not in the east, where they pushed across the steppe all the way to the Altai and South Asia.

Everything seems to fit - the archaeology, ancient DNA, modern Y-DNA and mtDNA, and now the autosomal clusters from my ADMIXTURE runs.

Western Europe is indeed on the periphery of the Northwest Eurasian range, as is South Asia, and the Caucasus. That doesn't mean they're crappy areas, in fact it might mean quite the opposite.

I'm sorry, but it seems to me you're not comprehending the concepts covered here too well.

princenuadha said...

This looks very interesting!!! I'm hoping you could explain some stuff, though.

"the mystery about the basic origins and genetic structure of Europeans was solved... It shows the relative spread of three key genetic clusters, from the K=13"

Let me get this straight. You picked those three because they were the most representative of Europe? Is that the only reason (because, incidentally, they actually look pretty interesting for West Eurasians as a whole)?

"The yellow is best described as Mediterranean or Southwest Eurasian,"

What is yellow in the K=13, "Mediterranean"?

"The northern Levant is actually dominated by a fourth West Eurasian cluster"

Then it would make a LOT of sense that the Northwest Eurasian cluster did not form south of the caucasus (or the whole of the middle east for that matter).

"those who’d like to see an Indo-European Urheimat in Asia, as opposed to Europe."

Actually that becomes pretty trivial seeing how the modern people of Greece and the Middle East have very little "Northwest Eurasian" (NE). So even if NE evolved there, the modern inhabitants aren't connected to them.

The amazing thing you've shown is a break between northern west eurasians and the southern west eurasians in which the whole of the middle east and greece is the southern.

"Thus, under such a dramatic model, the North Atlantic is essentially the remnant of the pre-Baltic Northwest Eurasians"

The map does look like BA spread into NA territory but why would you think BA is an offshoot of NA?

" Indeed, there seems to be a correlation between the highest relative frequencies of the North Atlantic, and regions that are still home to non-Indo-European speakers"

Wait, I'm confused. Are you saying you don't think NA largely spread by indo European? But you also said the entire NE spread with indo European...

If NA spread to western europe and India quite recently the only connection I can find is indo European.

Mark said...

Check out this study. How would this be affected by the data?

Here is a study which shows a global summary of the extent to which current genetic knowledge can explain lactase persistence phenotype frequency.

A worldwide correlation of lactase persistence phenotype and genotypes.
Yuval Itan et al.

princenuadha said...

Now that I think about it, LP in Europe is associated with northwest European and thought to have been spread by indo European. I think that fits in with the idea of two closely related indo European source groups. One in the southern steppes, which migrated to NW Europe bringing LP and r1b, and another in the northern forest zone that spread in eastern Europe bringing r1a.