
Monday, September 8, 2014

Eurogenes ANE K7


Update 01/01/2015: ANE is the primary cause of west to east genetic differentiation within West Eurasia.

...

As its name implies, the Eurogenes ANE K7 is specifically designed to estimate Ancient North Eurasian (ANE) ancestry. It's based on a series of supervised runs with the ADMIXTURE software, and it's freely available at GEDmatch under the Eurogenes Ad-mix tests tab.

The ANE component is not modeled on the Mal'ta boy or MA-1 genome, the main ANE proxy in scientific literature, because this sample didn't offer enough high quality markers for the job. So instead, I used the non-East Asian portions of several Karitiana genomes from the HGDP.
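For anyone unfamiliar with supervised mode, here's a rough sketch of the bookkeeping it involves. Per the ADMIXTURE manual, a supervised run reads a .pop file with one line per individual, in the same order as the .fam file: a population label for each reference sample and "-" for each sample whose ancestry is to be estimated. The helper name, the sample IDs and the labels chosen below are all hypothetical, and this is not the actual pipeline behind the K7.

```python
def build_pop_lines(fam_lines, reference_labels):
    """Return .pop lines in the same order as the given .fam lines.

    fam_lines: lines from a PLINK .fam file
               (FID IID father mother sex phenotype).
    reference_labels: dict mapping an individual ID (IID) to a
                      population label (hypothetical labels here).
    """
    pop_lines = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        iid = fields[1]
        # Reference samples get their label; all others get "-",
        # which asks ADMIXTURE to estimate their ancestry.
        pop_lines.append(reference_labels.get(iid, "-"))
    return pop_lines


# Hypothetical .fam content: two reference samples and one test sample.
fam = [
    "FAM1 REF_A 0 0 0 -9",
    "FAM2 REF_B 0 0 0 -9",
    "FAM3 TEST_1 0 0 0 -9",
]
labels = {"REF_A": "ANE", "REF_B": "ENF"}
pop = build_pop_lines(fam, labels)
print("\n".join(pop))
```

The resulting lines would be saved to a .pop file sharing the prefix of the genotype file before running ADMIXTURE with its --supervised flag, with K equal to the number of distinct reference labels.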

I wasn't sure what was going to come of that, but it actually seems to have worked out really well. Below are the results for several individuals that were not used in the making of the test, and clearly their ANE scores look pretty damn solid going by recent papers. For instance, both Lazaridis et al. and Raghavan et al. estimate the Karitiana Indians at just over 41% ANE (see here and here).

Karitiana_HGDP00998
ANE 41.56%
ASE 0.41%
WHG-UHG 0%
East_Eurasian 58.01%
West_African 0%
East_African 0.01%
ENF 0%

Lezgin_GSM536850
ANE 26.74%
ASE 3.88%
WHG-UHG 14.65%
East_Eurasian 0%
West_African 0.01%
East_African 0%
ENF 54.72%

Bedouin_HGDP00651
ANE 0%
ASE 0%
WHG-UHG 0.05%
East_Eurasian 1.49%
West_African 0%
East_African 8.19%
ENF 90.27%

Sardinian_HGDP01067
ANE 0%
ASE 0%
WHG-UHG 49.49%
East_Eurasian 1.8%
West_African 0.01%
East_African 0.01%
ENF 48.69%
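If you want to sanity-check a set of results like these, a minimal sketch along the following lines should do. It parses the "component value%" layout shown above and confirms that the seven components add up to roughly 100%. The function name is made up for illustration, and the input is simply the Lezgin result copied from this post.

```python
def parse_k7(text):
    """Parse lines such as 'ANE 26.74%' into a dict of floats."""
    result = {}
    for line in text.strip().splitlines():
        # Split on the last run of whitespace: name on the left, value on the right.
        name, value = line.rsplit(None, 1)
        result[name] = float(value.rstrip("%"))
    return result


lezgin = parse_k7("""
ANE 26.74%
ASE 3.88%
WHG-UHG 14.65%
East_Eurasian 0%
West_African 0.01%
East_African 0%
ENF 54.72%
""")

total = sum(lezgin.values())
# Allow a small tolerance because the published values are rounded.
print(len(lezgin), "components, total within tolerance:", abs(total - 100) < 0.1)
```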

You can also cross-check your ANE score with the results in this spreadsheet and table. The spreadsheet includes ANE estimates for more than 2,000 individuals that I tested with the ADMIXTURE software in supervised mode (see here).

On the other hand, the table comes from the Lazaridis et al. preprint, which I'm sure many of you have read by now several times over. And please pay attention to the range of ANE proportions for each population, rather than just the point estimates.
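To make the point about ranges concrete, here's a small sketch of comparing an individual score against a population's range rather than its point estimate. The function name is hypothetical, and the range used in the example is a placeholder, not a figure taken from the Lazaridis et al. table.

```python
def compare_to_range(score, low, high):
    """Say where an ANE score (in %) falls relative to a population range."""
    if low > high:
        raise ValueError("low must not exceed high")
    if score < low:
        return "below range"
    if score > high:
        return "above range"
    return "within range"


# Placeholder range for illustration only; substitute the published
# range for the population you are comparing against.
placeholder_low, placeholder_high = 10.0, 20.0
print(compare_to_range(15.3, placeholder_low, placeholder_high))
```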

Obviously, there are also six other ancestral components in this test (hence the K7 in the name). They're basically byproducts of me trying to isolate ANE, and don't necessarily mean anything. Nevertheless, here's a brief rundown of what I think some of them might represent...

Ancestral South Eurasian (ASE): this is a really basal cluster that peaks in tribal groups of Southeast Asia. It's probably very similar in some ways to the Ancestral South Indian (ASI) component described by Reich et al. a few years ago.

Western European/Unknown Hunter-Gatherer (WHG-UHG): this essentially looks like a West Eurasian forager component, and includes the forager-like stuff carried by Neolithic farmers (Oetzi the Iceman has 40% of it).

Early Neolithic Farmer (ENF): I'd say that this is the component of the earliest Neolithic farmers from the Fertile Crescent.

The other three components should be easy to work out from their names. They're almost identical to several components with the same or similar names from my other tests.

Some of you might be wondering why this test doesn't offer an Early European Farmer (EEF) cluster. But the answer to that should be obvious by now. EEF is not a stable ancestral component. It's actually a composite of at least two ancient components, including the so-called Basal Eurasian and WHG-UHG. If it really was a genuine ancestral component, like ANE, then I'm pretty sure I'd be able to catch it with ADMIXTURE. But I can't.

Indeed, a really important thing to understand about the Lazaridis et al. study is that it doesn't actually attempt to estimate overall WHG-UHG ancestry in Europeans, but rather the excess WHG-UHG on top of what is already present in the EEF proxy Stuttgart.
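The arithmetic behind that distinction can be sketched as follows, assuming the forager-like ancestry inside EEF simply adds to the excess WHG-UHG. The function and parameter names are hypothetical, and the inputs are illustrative rather than estimates for any real population. The 0.40 only echoes the figure quoted above for Oetzi; the corresponding share for Stuttgart isn't given in this post.

```python
def total_whg_like(whg_excess, eef, whg_share_in_eef):
    """Overall forager-like ancestry as a fraction of the genome.

    whg_excess: WHG-UHG reported on top of the EEF proxy.
    eef: EEF proportion.
    whg_share_in_eef: assumed forager-like fraction within the EEF proxy.
    """
    return whg_excess + eef * whg_share_in_eef


# Illustrative inputs only, not results from any paper or test.
example = total_whg_like(whg_excess=0.30, eef=0.50, whg_share_in_eef=0.40)
print(f"{example:.2f}")
```

Under those made-up inputs, the overall forager-like share comes out well above the excess figure alone, which is why the two kinds of estimate shouldn't be compared directly.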

Also worth noting is that this K7 can be a bit noisy. That's mainly because it's very difficult to correctly assign proportions of ancient ancestry to present-day samples. But like I said above, this test is basically designed to estimate ANE scores. If you want to learn about your overall ancestry, then I recommend the Eurogenes K13 and K15 tests.

Missing SNPs might also be an issue for some people. It stands to reason that results will be noisier with more missing markers and no-calls.
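If you're curious how many no-calls your own raw data file contains, something like this minimal sketch gives a rough figure. It assumes a 23andMe-style text file: tab-separated rsid, chromosome, position and genotype columns, comment lines starting with "#", and no-calls written as "--". Other vendors use different layouts, so treat the parsing as an assumption. The marker IDs in the example are made up.

```python
def no_call_rate(lines):
    """Return the fraction of markers whose genotype is a no-call."""
    total = 0
    no_calls = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        total += 1
        if "-" in fields[3]:
            no_calls += 1  # "--" (or a single "-") marks a no-call
    return no_calls / total if total else 0.0


# Made-up sample rows for illustration.
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t1000\tAA",
    "rs0000002\t1\t2000\t--",
    "rs0000003\t2\t3000\tCT",
    "rs0000004\t2\t4000\tGG",
]
rate = no_call_rate(sample)
print(f"No-call rate: {rate:.2%}")
```

With a real file, you'd pass the open file handle instead of the sample list.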

Have fun and don't forget to make a donation at some point to the Eurogenes cause, via the PayPal tab at the top right of the page. This will help me to keep up with what's going on in the world of Paleogenomics, and continue blogging and running analyses.

Citations...

Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, arXiv, April 2, 2014, arXiv:1312.6639v2

Raghavan et al., Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans, Nature (2013), published online 20 November 2013, doi:10.1038/nature12736

See also...

Corded Ware Culture linked to the spread of ANE across Europe