Friday, March 2, 2012

ADMIXTURE analysis of West Eurasia - K=12 run

Moving up to K=12 (twelve ancestral clusters assumed) has resulted in a major shake-up of the West Eurasian components. The Western European cluster from the K=11 run has been replaced by North Atlantic and Mediterranean clusters. Interestingly, the new North Atlantic cluster has also eaten up a lot of the putative West Central Asian influence in samples from Northwestern Europe, basically reversing the implied direction of gene flow from east-to-west to west-to-east.

What this clearly shows is that it's not a good idea to take the results of a single ADMIXTURE run too literally. It's more appropriate to consider output from a variety of K values when formulating theories, and then to cross-check this info against results from other analyses using very different algorithms. Anyway, my impression here is that ADMIXTURE is now beginning to struggle with the lack of differentiation between the West Eurasian clusters, and is producing quirkier results as well as more noise. I expect that the K=13 run will be an uphill battle, and will most likely result in a fairly uninformative cluster somewhere. Hopefully I'll be proved wrong.

Key: Red = East Asian, Orange = Mediterranean, Yellow = Sub-Saharan, Light Green = South Asian, Green = North Asian, Aqua Green = East African, Aqua Blue = Caucasus, Light Blue = North Atlantic, Dark Blue = West Central Asian, Dark Purple = Southwest Asian, Light Purple = Baltic, Dark Pink = Southeast Asian. See spreadsheet for details. An image of the Fst (genetic distances) between the clusters can be found here.


Ted Kandell said...

David, anxiously awaiting your higher-order K results.

I think, based on the very early worldwide K=15 results from Dienekes last year, there will be some sort of optimal set of clusters that looks like this, but perhaps with a different set of SNPs at a much higher SNP density:

1. Paleo-African (or separate Bushman, Mbuti Pygmy, and Hadza)
2. Neo-African (West African)
3. East African
4. Northwest African
5. Mediterranean
6. "Atlantic" European
7. Baltic
8. Southwest Asian (Arabian)
9. "Caucasus" (Anatolian Neolithic)
10. West Central Asian
11. South Asian
12. North Asian / Siberian (incl. N. America out to Greenland)
13. "East Asian"
14. Southeast Asian
15. Austronesian (Papuan + Australian Aboriginal)
16. North American (Na-Dene, etc.)
17. Amerindian

A minimum of 17, but the Paleo-African may split up into 3 based on what we see on the various PCA plots that you did. It may be that the Austronesian splits into two between the Papuan and Australian Aboriginals. (If we can actually get some Australian Aboriginal data, that is.) That would make a maximum of K=20 worldwide.

Check your early PCA plots which show clear Paleo-African components and high divergences among the South African San, Mbuti, and Hadza. You'll see some incredibly divergent samples among these three groups that are quite different even from other Africans, at least as much so as typical Africans are from Eurasians.

The key to this is finding an unbiased set of SNP results. Maybe it will be possible for you to try this on ancestry-informative SNPs (not in full LD with others) derived from the 1K Genomes and include SNPs under 1% frequency. Then you could run the same set using the "common SNP array" set being used now and see the difference.

Maybe we are very close to having this "World K=20" even now, so the biggest obstacle is the single Australian Aboriginal full genome and no others from this critically important population.

Hopefully "the dust will settle" shortly, and you'll be able to get a "K=20" that "works", but remember that what may seem to be an "error" may not in fact be one at all, and may have a very valid archeological, paleo-linguistic, or historical explanation.

Ted Kandell said...

Things to be aware of that may not "make sense" on higher order K PCAs but in fact have very valid historical reasons. Many of these worldwide long-distance migrations took place well before the "Age of Exploration" after 1492:

1. Central Asian R1b1c-V88 in West-Central Africa (Cameroon) carried by the Bantu Expansion as far south as Namibia and South Africa. mtDNA M1 and U6 may also represent a similar signal of a back migration.

2. Greenland Inuit clustering with Paleo-Siberian Chukchi (on your Native American-like Finnish segments plot), and Paleo-Greenland Saqqaq Man being essentially Chukchi in his full genome, are a possible sign of a "Trans-Arctic Diffusion" that was continuous and continued until rather recently. This would have left traces in Scandinavia and North America.

3. The mtDNA U6b1a Mande from Senegal who is 2 off at the full mtDNA sequence from a Yakut from Siberia. This is the "Saami-Berber" link.

4. A possible Austronesian component on the East African coast from the Austronesian settlement in Madagascar.

5. The slave trade in exotic women in the Muslim world, which extended from the Philippines (Malays) to the Caucasus (Circassians) and East Africa ("Sidis", Africans in South Asia and "Black Circassians"). "Exotic women" were very high status, and often were the mothers of the primary legitimate heirs. Also, in the Muslim world, "Mamluks" from various places (Central and East Asian Turks, West and East Africans, Caucasians) were often extremely high-status men and rulers of empires. We know that there were both Icelandic slaves and Turks in North Africa, as well as many Circassians and Slavs. Also, think of the power that Valide Sultan "Hurrem" of Ukrainian origin wielded in the Ottoman Empire. The point is that, in the Muslim world, "slavery" involved both low-status and extremely high-status men and women.

6. The Roma and Domari migrations all over the Muslim World and Christian Europe, which brought South and Central Asian ancestry almost everywhere. It isn't always obvious who has this sort of Roma ancestry.

7. A "back migration" of slaves and others from the Americas to Iberia, England, and possibly elsewhere (North Africa, as "exotic" slaves?) after 1492. This includes Mexican troops brought by the Spanish to "Mexico City" in the Philippines.

Ted Kandell said...

8. The Jews. We see low levels of the most exotic sorts of ancestry among Jews - mostly on the mtDNA, not the Y - including West African L2a1l2 and South Chinese M33c1 and N9a3 among Ashkenazim from Belarus. We see a rather high Northwest African component (2-10%) among most Jewish groups everywhere (e.g. even Georgian Jews) and while we don't see Jews with Uralic N1c1, we do even see Ashkenazi Jews with Northwest African E1b1b1b-M183. Much of this has to do with Jews being "panmictic" with each other across the world, and also Jews being religiously and culturally accepting of converts (including freed slaves who were automatically Jews) and local non-Jewish "concubine" women who produced 100% fully-Jewish legitimate children. There is a huge misunderstanding of the nature of ancestry among Jews by non-Jews, thinking that a Jewish identity has something to do with physical ancestry. There is also a deep misunderstanding of the historical nature of conversion to Judaism, which mostly involved *individual* conversion and not mass conversion. (Except perhaps in the case of pre-Islamic Arabs and Berbers.) Once Jews admixed locally, Jews who traded or migrated to distant lands would intermarry with the local Jews who were already admixed to a certain degree, and when these left (or were expelled) they would have brought their "exotic" alleles to other far-flung Jewish communities worldwide. Southeast Asian (and Austronesian!) ancestry among Ashkenazim has *nothing* to do with "Khazars" (since we don't see it in anyone else west of South China) but has *everything* to do with Jews from Western Eurasia (Narbonne) trading with the thousands of Jews in Tang Dynasty China before 843. Among some Ashkenazim from Belarus we see low-level non-zero components at 11/12 or 8/9 of "world9". Not Paleo-African or Amerindian, but pretty much something from all the rest.

Ted Kandell said...

9. In turn, the Jews, because of the expulsions and persecutions and later forced conversions, were the ancestors of millions of other people. We see very close Y matches between Ashkenazi Jews from Vilnius, New Mexican Hispanics, and "Chamorros" from Guam with the seemingly identical surnames "Tainovich" and "Tenorio". We know that a Don Bartholome Thenorio migrated (fled!) to Manila in 1598 and his descendants then went to Saipan and Guam by about 1700. "One step ahead of the Inquisition." Both seem to be somewhat more distant matches to Sephardic Jews named "Alfasi" (Al-Fasi, from Fez, Morocco). We see the same unusual worldwide distributions of Y clusters all across the Y tree, where various sorts of admixed people from the Spanish and Portuguese Empires worldwide share close relationships on the Y (and sometimes on the mtDNA as well) with Ashkenazim.
My own very closest matches in the huge "Ashkenazi" G2c-M377 DYS425=null cluster are in fact not Ashkenazi at all, but "Mestizos [Admixed]" from Merida, Yucatan, Mexico.
I carry one copy of "ultra-Archaic" HLA-B*27. This same HLA-B*27 is found among Mestizos in Mexico: "Semitic genes". ;)
In turn, Ashkenazim are the ancestors of tens of thousands of Catholic "Poles" (including Polish Catholics from Western Ukraine), Frankists who converted mostly around 1760, and Jews are also the ancestors of many Muslim Turks (including a group of Ashkenazim as well). In that case, it would not be so farfetched to see shared segments between Turks, Poles, "Mayans", and Chamorros from Guam.
My guess is that we would also see these wherever "Portuguese" settled, including in Sulawesi, East Timor, Macau, Goa and Sri Lanka, and maybe Mozambique and Angola, not just the New World which had thousands of such "Portuguese" (a euphemism for Crypto-Jews) in Mexico, Colombia, and Peru as well as Brazil. All these people seem to be well aware of this, including people from Sulawesi, the Philippines, Sri Lanka, Turkey, Poland, and all over Latin America. There was even a rather massive back-migration from the Americas when Recife was conquered by the Portuguese Empire from the Dutch in 1654, when 5000 Jews barely escaped with their lives. (Or were deliberately let go?) We know many other Crypto-Jews from the Americas fled to Amsterdam and Salonika and were burned in effigy. There's no saying that these did not have local admixture, including Amerindian admixture. The Jews in general were not obsessed by "Race" in the way that the "blue-blooded Spaniards" were. "Race" wasn't a religious concept in Judaism, and anyway, Jews didn't qualify as "pure" under "Limpieza de Sangre" just as much as Native Americans and Africans didn't either.

10. Of course, the low-level African and Native American ancestry in England, Iberia, and elsewhere. I suspect that the Native American admixture might also be found in North Africa, where there was a very active trade in exotic slave women. There were also Native American soldiers in "Mexico City" in the Philippines near Manila.

The point is, before you consider what's strange and "makes no sense", think about what's archaeologically and historically possible, or likely, and what we already see from the Y and mtDNA evidence.

Paul Ó Duḃṫaiġ said...

If you ask me, it's not "North Atlantic" that's eating up the "West Central Asian" component but Mediterranean. If you look at the combined French sample (HGDP + Eurogenes submissions) you see that the "Western European" component was 53.48%, yet the "North Atlantic" was 47.77%.

Mediterranean came in at 37.24% in K=12. My feeling is that most of the reduction in "Southwest Asian" (-7.91%), "West Central Asian" (-3.38%) and Caucasus (-3.21%) went to this Mediterranean component. However, those only add up to about 14.5%, which would imply that the rest of the Med component probably came from "Western European"; however, it's only down -5.72% (if comparing to "North Atlantic" in K=12).
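The bookkeeping in this comment can be checked with a minimal Python sketch. It uses only the French-sample percentages quoted above; the variable names are illustrative assumptions, and the "unexplained" remainder is simply the Mediterranean share minus the three quoted drops, not a result from the ADMIXTURE run itself.

```python
# Figures quoted in the comment for the combined French sample (percent).
western_european_k11 = 53.48   # "Western European" at K=11
north_atlantic_k12 = 47.77     # "North Atlantic" at K=12
mediterranean_k12 = 37.24      # "Mediterranean" at K=12

# Drops between K=11 and K=12 as quoted, in percentage points.
drops = {
    "Southwest Asian": 7.91,
    "West Central Asian": 3.38,
    "Caucasus": 3.21,
}

# Points of the Mediterranean share that the three drops could account for.
explained = round(sum(drops.values()), 2)
# Mediterranean share left over once those drops are subtracted.
unexplained = round(mediterranean_k12 - explained, 2)
# Difference between the old Western European and new North Atlantic shares.
we_to_na_gap = round(western_european_k11 - north_atlantic_k12, 2)

print(f"Sum of the three drops: {explained}")
print(f"Mediterranean not covered by them: {unexplained}")
print(f"Western European (K=11) minus North Atlantic (K=12): {we_to_na_gap}")
```

The gap computed from the two rounded percentages differs slightly from the -5.72 quoted in the comment, presumably because the comment's figure was taken from unrounded averages.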

If you ask me "North Atlantic" absorbed most of what was previously "Baltic" in the French sample. However it lost a big chunk of the previous "Western European" component to "Mediterranean".

One only has to look at the averages for the Irish. Baltic crashes from an average of 29.77% to 2.66% (-27.11 change), yet the change in "North Atlantic" versus "Western European" is only 10.45%. This implies, to me anyway, that a big chunk of Western European ended up in the Med component among the Irish. Likewise, the Med component caused a big decrease in the likes of West Central Asian among the Irish (and others).

Perhaps a better title for "North Atlantic" would be "North-West European"; likewise, Baltic would be "North-East European".

More than likely, as you add extra components you will end up splitting the Mediterranean component into a "West Europe" and "East Med".

pconroy said...


My father has a distant relative from Guatemala who is supposedly 100% Mayan. His daughter tested him and found his Y-DNA was I1, and he has 5 relatives in Ireland and 2 in Scandinavia. So it would seem the connection was with some Viking or Norman influenced Irishman.
One name comes to mind that fits the bill, he was an Irishman from Wexford (heavily Viking/Norman), a spy, a linguist, a polymath, a soldier and the drafter of the FIRST Proclamation of Independence in the Americas, he was none other than William Lamport:

Better known by the nickname "El Zorro" (the fox).

Gestr said...

I would like to ask something: I did the K12 test, and I have 15% Caucasian. What does it mean? What type of genetics (haplogroups?) do the Caucasians have?