Saturday, December 28, 2013

EEF-WHG-ANE test for Europeans

This test attempts to fit you to the three inferred prehistoric European populations as described in this recent preprint. The relevant Excel file can be downloaded here, and all you have to do is stick your Eurogenes K13 results into the fields provided to get the EEF-WHG-ANE ancestry proportions. A modified version for Near Eastern and Southeast European users can be accessed here.

The test is based on correlations between the average levels of the Eurogenes K13 and the ancient components among selected European populations. Below is a brief description of each of the ancient components.

Early European Farmer (EEF): apparently this is a hybrid component, the result of mixture between "Basal Eurasians" and a WHG-like population possibly from the Balkans. It's based on a 7,500 year old Linearbandkeramik (LBK) sample from Stuttgart, Germany, but today peaks at just over 80% among Sardinians.

West European Hunter-Gatherer (WHG): this ancestral component is based on an 8,000 year old forager from the Loschbour rock shelter in Luxembourg, who belonged to Y-chromosome haplogroup I2a1b. However, today the WHG component peaks among Estonians and Lithuanians, in the East Baltic region, at almost 50%.

Ancient North Eurasian (ANE): this is the twist in the tale, a component based on a 24,000 year old Upper Paleolithic forager from South Central Siberia, belonging to Y-DNA R*, and known as Mal'ta boy or MA-1. This component was very likely present in Southern Scandinavia since at least the Mesolithic, but only seems to have reached Western Europe after the Neolithic. At some point it also spread into the Americas. In Europe today it peaks among Estonians at just over 18%, and, intriguingly, reaches a similar level among Scots. However, numbers weren't given in the paper for Finns, Russians and Mordovians, who, according to one of the maps, also carry very high ANE, but their results are confounded by more recent Siberian (ENA) admixture.
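For readers curious how a spreadsheet like this can turn K13 scores into three ancient components, here is a minimal Python sketch of one possible approach: a weighted sum per ancient component, rescaled to 100%. This is not the formula used in the Excel file. The function name, the weights and the toy K13 values are all invented for illustration, and the K13 component names are only placeholders.

```python
def estimate_ancient(k13, weights):
    """Map K13 percentages to ancient-component percentages.

    k13:     dict of K13 component name -> percentage
    weights: dict of ancient component -> {K13 component -> weight}
             (hypothetical values; the real coefficients are not given here)
    """
    raw = {}
    for ancient, per_component in weights.items():
        total = sum(per_component.get(name, 0.0) * pct for name, pct in k13.items())
        raw[ancient] = max(total, 0.0)  # a negative estimate is clipped to zero
    scale = sum(raw.values())
    if scale == 0:
        raise ValueError("all estimates are zero; cannot rescale to 100%")
    # Rescale so the three estimates add up to 100%.
    return {name: 100.0 * value / scale for name, value in raw.items()}


# Toy inputs, invented purely for illustration.
toy_k13 = {"North_Atlantic": 50.0, "Baltic": 30.0, "West_Med": 20.0}
toy_weights = {
    "EEF": {"North_Atlantic": 0.4, "Baltic": 0.2, "West_Med": 0.8},
    "WHG": {"North_Atlantic": 0.4, "Baltic": 0.6, "West_Med": 0.2},
    "ANE": {"North_Atlantic": 0.2, "Baltic": 0.2, "West_Med": 0.0},
}
result = estimate_ancient(toy_k13, toy_weights)
```

In practice the weights would have to be fitted against populations with published EEF-WHG-ANE estimates, which is the step this sketch leaves out.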

It's important to note that this test is only likely to be accurate for people of European ancestry, and indeed only those who aren't outliers from the main European clines of genetic diversity. For details of what that means, please consult the aforementioned paper. However, roughly speaking, if you're of European origin and don't score more than 3% East Asian, Siberian, Amerindian, South Asian, Oceanian, Northeast African and/or Sub-Saharan admixture, then you should get a coherent result. Users from the Near East and Caucasus should run the version specifically designed for them, while those from Southeastern Europe might find it useful to run both calculators and then compare the results.
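The rough 3% rule above is easy to check by hand, but it can also be sketched in a few lines of Python. The function name and the dictionary layout are assumptions made for this example, and the sketch reads the rule as applying to each component separately, which is only one possible interpretation.

```python
# Components named in the 3% rule of thumb above.
NON_EUROPEAN = [
    "East Asian", "Siberian", "Amerindian", "South Asian",
    "Oceanian", "Northeast African", "Sub-Saharan",
]


def flagged_components(k13, threshold=3.0):
    """Return the listed components whose score exceeds the threshold.

    k13 is a dict of component name -> percentage. Missing components
    are treated as 0. An empty result suggests a coherent outcome.
    """
    return {
        name: k13[name]
        for name in NON_EUROPEAN
        if k13.get(name, 0.0) > threshold
    }


# Toy example with invented scores.
flags = flagged_components({"Siberian": 4.2, "South Asian": 0.5})
```

Here `flags` contains only the Siberian score, since it's the only one above 3%.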

Thanks to project member DESUK1 for putting this together at such short notice, and MfA for the modified version. Please post your results in the comments section below and state your ancestry when you do. This will help us to improve the accuracy of the test. My results make perfect sense, considering my Polish ancestry.

EEF 42.012706
WHG 40.52702615
ANE 17.46026785

This is my interpretation of who these components represent. Of course, this model might change when more ancient genomes are analyzed.

WHG and WHG/ANE: indigenous European hunter-gatherers
EEF: mixed European/Near Eastern Neolithic farmers
ANE/WHG: Proto-Indo-European invaders from the Eastern European steppe
ENA/ANE: early Uralics from the Volga-Ural region
EEF/WHG/ANE: late Indo-Europeans (i.e. Celts, Germanics and Slavs)


Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, bioRxiv, Posted December 23, 2013, doi: 10.1101/001552

See also...

Ancient human genomes suggest (more than) three ancestral populations for present-day Europeans

Ancient North Eurasian (ANE) levels across Asia

Thursday, November 21, 2013

Updated Eurogenes K13 now at GEDmatch

The new K13 population averages and genetic (Fst) distances between the inferred ancestral clusters are available here and here, respectively. To find this test at GEDmatch do this:

GEDmatch > Ad-Mix Utilities > Eurogenes > K13

Below is a 2D PCA based on the average K13 results of the European and Asian reference populations, courtesy of project member PL16.

I now have four tests at GEDmatch with Oracles: the Jtest, EUtest, K15 and K13. It's useful to keep in mind that these tests will differ in their interpretation of the data, and perhaps accuracy, depending on the ancestry of the user. For instance, the new K13 should be more useful for Central and South Asians than any of the others, because it features new reference samples relevant to them.

Monday, October 7, 2013

Eurogenes K15 now at GEDmatch

This new test is essentially an upgraded version of the EUtest. Unlike the original, it includes an Amerindian component and five native reference populations from North and Central America. So obviously it should be a lot more useful for users from the New World who are wondering about Amerindian admixture.

GEDmatch > Ad-Mix Utilities > Eurogenes > Eurogenes EUtestV2 K15

I just tried it myself, and have to say that the 4-Ancestors Oracle results were impressive. In other words, they were very accurate based on what I know about my recent ancestry. On the other hand, I'd say the default Oracle was picking up more ancient gene flows. However, this might not be the case for everyone, so let's hear some feedback, discuss the outcomes, and perhaps tweak the settings if necessary.

One of the most important things to keep in mind is to ignore all results under 1%. These are likely to be noise.
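If you keep your results in a script rather than a spreadsheet, the 1% rule can be applied with a simple filter. This is just a sketch: the function name and the example scores are made up, and the component names are taken from the list of K15 maps in this post.

```python
def drop_noise(scores, cutoff=1.0):
    """Keep only the components scoring at least `cutoff` percent."""
    return {name: pct for name, pct in scores.items() if pct >= cutoff}


# Invented scores for illustration.
cleaned = drop_noise({"Baltic": 35.1, "North Sea": 22.4, "West Asian": 0.6})
```

Note that this only hides the small scores. It doesn't redistribute them, so the remaining figures will no longer add up to exactly 100%.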

The population averages and Fst distances between the ancestral clusters are here and here, respectively. Below are spatial maps of the main West Eurasian components courtesy of Gui (FR7): Baltic, North Sea, Atlantic, East Euro, West Med, East Med, West Asian.

See also...

Orcadians, the K15 and the calculator effect

Saturday, March 9, 2013

Eurogenes K36 now at GEDmatch

I've just put together a new test for GEDmatch called the Eurogenes K36. Obviously, the K36 means that it features thirty-six ancestral clusters. It probably won't include any Oracles, mostly because the Calculator Effect would render these useless if they were based on the average results of the reference samples, and it'd be very time-consuming for me to test a wide variety of other samples in supervised mode using thirty-six sets of allele frequencies.

The main purpose of the Eurogenes K36 is to help users unravel the ethnic origins of local areas of their genomes (aka half-segments), hence the high number of ancestral categories, some of which are very specific. In other words, the test is mainly a chromosome painting utility. It's accessible via the GEDmatch Ad-Mix link below:

GEDmatch > Ad-Mix page > Eurogenes > Eurogenes K36

An important point to keep in mind is not to take the ancestry proportions too literally. If you're, say, English, and you get an Iberian score of 12% this doesn't actually mean you have recent ancestry from Spain or Portugal. What it means is that 12% of your alleles look typical of the reference samples classified as Iberian, and this figure might only indicate recent Iberian admixture if it's clearly higher than those of other English users.

Another way to look at it is that the ancestry proportions are like map coordinates, and they'll place you with a very high degree of accuracy on a genetic map featuring other users. Indeed, please feel free to post your scores and ancestry details in the comments below to help others get an idea of what their results might represent. My results are listed below. The scores put me squarely in Poland relative to those of other European samples I've run, which is correct.

Also worth mentioning is that this test focuses on much deeper ancestry than the Ancestry Composition at 23andMe. Hence, I expect that many Europeans will score a few percent in non-European clusters. However, like many ADMIXTURE results, this could give us strong hints about population movements into Europe during prehistory and early history, so it's worth keeping an eye on.
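To make the map coordinates analogy above a bit more concrete, here's a small Python sketch that treats each set of scores as a point and ranks other users by straight-line (Euclidean) distance. It isn't how any GEDmatch tool works internally, as far as this post says. The function names, sample labels and scores are all invented, and only three made-up clusters are used to keep it short.

```python
import math


def distance(a, b):
    """Euclidean distance between two score dicts (missing clusters count as 0)."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def rank_by_distance(user, others):
    """Return (distance, label) pairs, closest first."""
    return sorted((distance(user, scores), label) for label, scores in others.items())


# Invented scores over three hypothetical clusters X, Y and Z.
user = {"X": 50.0, "Y": 30.0, "Z": 20.0}
others = {
    "sample_1": {"X": 47.0, "Y": 34.0, "Z": 19.0},
    "sample_2": {"X": 20.0, "Y": 30.0, "Z": 50.0},
}
ranked = rank_by_distance(user, others)
```

With these toy numbers `sample_1` comes out closest. With real K36 scores posted by users who state their ancestry, the same ranking gives a rough idea of where you sit on the genetic map.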