Eurogenes Genetic Ancestry Project (feed updated 2024-03-13)</br>Davidski (http://www.blogger.com/profile/04637918905430604850)</br></br>
Genetic ancestry online store (closed until further notice)</br>Published 2021-08-09, updated 2023-11-02</br></br><b>Please note that the store is closed until further notice.</b> Thank you for your continued support.</br></br>
...</br></br>
</br>Following a rigorous testing phase, the awesome Global 25 analysis is now available at the store for $12 USD. What's so awesome about this test, you might ask? See <a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">here</a> and <a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">here</a>.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://drive.google.com/file/d/1k6z0j1Rt0zSHwQqpJ2OO_xcxX9UX0Zbz/view?usp=sharing" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8XLofDFLYP2H1jLLnrd8jHQmWPbo3cgkhwjFPrtRbqhO7ugVJsmEnYJBkUlsg59fqRKGvFIaq7xZNjbQ86Q_1tgKe8Wp1cuTPSC7pEUUAPiMkZ3bFSTt8lAglc81P1Qdiv0DNPj0cmKA0/s1600/Global_25_graphic_small.png" data-original-width="300" data-original-height="383" /></a></div></br>
Please send your request and autosomal genotype data (from AncestryDNA, FTDNA, LivingDNA, MyHeritage or 23andMe) to <b>eurogenesblog at gmail dot com</b>.</br></br>
However, note that this test is free for anyone who already has Global 10 coordinates (see <a href="https://eurogenes.blogspot.com/2016/10/a-fresh-look-at-global-genetic-diversity.html">here</a>). That's right, if you already have Global 10 coordinates, all you have to do is to send me your data and say what it's for. Simple as that.</br></br>
...</br></br>
My <i>Celtic vs Germanic</i> Principal Component Analysis (PCA) is now available via the store for $6 USD (see <a href="https://eurogenes.blogspot.com/2018/09/celtic-vs-germanic-europe.html">here</a>). Please note that this test is only really useful for people of Central, Northern and/or Western European origin, and is geared toward those of overwhelmingly Northwestern European ancestry.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyP87G1s1cQO2kTOwWOxkSU6qQmIOA2_wXRL1cnWyvWXSHlpTsSnlLnaeFqeXcePrtKtxUPKX7mpzhQF0psfqx_DOT_3DfZid8W5hA4UAlIHR6GwdCuXW7sQr5L-GhreBrID1d6aZ7SjmS/s1600/Celto-Germanic_PCA_new.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjyP87G1s1cQO2kTOwWOxkSU6qQmIOA2_wXRL1cnWyvWXSHlpTsSnlLnaeFqeXcePrtKtxUPKX7mpzhQF0psfqx_DOT_3DfZid8W5hA4UAlIHR6GwdCuXW7sQr5L-GhreBrID1d6aZ7SjmS/s480/Celto-Germanic_PCA_new.png" data-original-width="1218" data-original-height="665" /></a></div></br>
Please send your request and autosomal genotype data (from AncestryDNA, FTDNA, LivingDNA, MyHeritage or 23andMe) to <b>eurogenesblog at gmail dot com</b>.</br></br>
...</br></br>
The popular Basal-rich K7 admixture test is now available via the store for $6 USD. <b>It's suitable for everyone, except people with significant (>10%) Sub-Saharan ancestry.</b> For more information about this test and some ideas about what to do with the output see <a href="https://eurogenes.blogspot.com/2016/07/sneak-peek-basal-eurasian-k7.html">here</a> and <a href="https://bga101.blogspot.com/2017/05/an-nmonte-and-4mix-guide-for.html">here</a>.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuz8k7mJCG8udhSJD8nEt1YVN5gCdliUWAJVQdW-hlt9F33xx78_NAO0f-QHF8c6ZUi0xemCRRZ7pbDWHcXSnzNUCqtZrNMPbwKaYFkwswQgPgc-rhvV5plg8P64nQYZdQPGOcO-LvsacQ/s1600/K7_triangle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuz8k7mJCG8udhSJD8nEt1YVN5gCdliUWAJVQdW-hlt9F33xx78_NAO0f-QHF8c6ZUi0xemCRRZ7pbDWHcXSnzNUCqtZrNMPbwKaYFkwswQgPgc-rhvV5plg8P64nQYZdQPGOcO-LvsacQ/s450/K7_triangle.png" data-original-width="1218" data-original-height="702" /></a></div></br>
Please send your request and autosomal genotype data (from AncestryDNA, FTDNA, LivingDNA, MyHeritage or 23andMe) to <b>eurogenesblog at gmail dot com</b>.</br></br>
See also...</br></br>
<a href="https://eurogenes.blogspot.com/2018/05/global25-workshop-1-that-classic-west.html">Global25 workshop 1: that classic West Eurasian plot</a></br></br>
<a href="https://eurogenes.blogspot.com/2018/05/global25-workshop-2-intra-european.html">Global25 workshop 2: intra-European variation</a></br></br>
<a href="https://eurogenes.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://eurogenes.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a></br></br>
New Global25 interpretation tools</br>Published 2020-08-06, updated 2021-08-09</br></br>They're available at <a href="https://yk.github.io/ancestry/">Ancestry Calculator</a> and <a href="https://genoplot.com/">GENOPLOT</a>. Unfortunately, I can't tell you exactly how to get the most out of them. All I can recommend is robust experimentation.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://genoplot.com" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4MESV_pxKYBaB1Q5cqEhS3du6zhwPZ40YOSsQVGEwMtjra8Qc9IE5v6tpAAQIxxwY9iHKxx9M90eON8x1zqXyo2Cyi8A9e9Fx6qoggljn-H2TlNW-WBdH5Ue8rvveRXrMzUG-qwZGC3w/s450/GENOPLOT_screen_cap.jpg" data-original-width="1299" data-original-height="614" /></a></div></br>
See also...</br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://bga101.blogspot.com/2021/08/genetic-ancestry-online-store.html">Genetic ancestry online store</a></br></br>
<a href="https://bga101.blogspot.com/2020/01/modeling-your-ancestry-has-never-been.html">Modeling your ancestry has never been easier</a></br></br>
Modeling your ancestry has never been easier</br>Published 2020-01-14</br></br>An exceedingly simple, yet feature-packed, online tool ideal for modeling ancestry with <a href="https://bga101.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Global25</a> coordinates is freely available <a href="https://vahaduo.github.io/vahaduo/">HERE</a>. It works offline too, after downloading the web page onto your computer. Just copy paste the coordinates of your choice under the "source" and "target" tabs, and then mess around with the buttons to see what happens. The screen caps below show me doing just that.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW-dyAjTK2Ac4_fFWeUXV5za8H2eUAt8ORr4-46m8iu-7TXR3p46xTOTaeoMlIcV_TYfLOJKO-6v7AZCGGt5jEPYZymub0B44LB0tIOiEtyULI3Jfblv8tTlqO3ua8DfXFJAwvNaXmdJ3T/s1600/V_guide1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW-dyAjTK2Ac4_fFWeUXV5za8H2eUAt8ORr4-46m8iu-7TXR3p46xTOTaeoMlIcV_TYfLOJKO-6v7AZCGGt5jEPYZymub0B44LB0tIOiEtyULI3Jfblv8tTlqO3ua8DfXFJAwvNaXmdJ3T/s420/V_guide1.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXlMLZEv6z4l8YQOSZ1ZYEC7uML8T01yOTiPh26ZlOEi9UQltaLzU-SL8gfwLZYyLhwaLB7vUdfwRkJ5jylI2lfrS5VKbLorqAyILm_35c-MAoV5JrnG5e9OME3DmkAM1Qu-_dw6n9UQDm/s1600/V_guide2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXlMLZEv6z4l8YQOSZ1ZYEC7uML8T01yOTiPh26ZlOEi9UQltaLzU-SL8gfwLZYyLhwaLB7vUdfwRkJ5jylI2lfrS5VKbLorqAyILm_35c-MAoV5JrnG5e9OME3DmkAM1Qu-_dw6n9UQDm/s420/V_guide2.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimEEnB7GRuTHRMMuStuLPPv0nYOoG7qkGeT6PmwZ9HcfhyphenhyphenHfa9bOXP5Np0_JhAxMS7DJoc9kHbeh1XtLmiLkx1ziTi40S6LrXp8HfRsmIT_Uf3zztl_E8D_oyHY-zkyaRGDO8qWBTcQ4ih/s1600/V_guide3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimEEnB7GRuTHRMMuStuLPPv0nYOoG7qkGeT6PmwZ9HcfhyphenhyphenHfa9bOXP5Np0_JhAxMS7DJoc9kHbeh1XtLmiLkx1ziTi40S6LrXp8HfRsmIT_Uf3zztl_E8D_oyHY-zkyaRGDO8qWBTcQ4ih/s420/V_guide3.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYBIz82u2Cv-C6oKQDk-sS4N3rgB3VHmViemVS_0SDMVAw4lylYYEz-2ZseJvl0cAC01RF-_urqdAUv05u3lGaxRxn_bTBge0T02FUjcrK_jlm1ENo99awzc1QFCd7RzseQR88-XFou5AC/s1600/V_guide4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYBIz82u2Cv-C6oKQDk-sS4N3rgB3VHmViemVS_0SDMVAw4lylYYEz-2ZseJvl0cAC01RF-_urqdAUv05u3lGaxRxn_bTBge0T02FUjcrK_jlm1ENo99awzc1QFCd7RzseQR88-XFou5AC/s420/V_guide4.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZTBy1pU0aXS-vHsoDMS22f_Ute3HJAnpjLyXnA7SfEqhGFy27jaGiSD2TBmFCCV0RpI5YEo-ZN1QugdXErpivb125Ub3vY8NQrVHKs61RsdWwDzvAcW5SH0R6N5UgopWtBqluzV5Lq9Ye/s1600/V_guide5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZTBy1pU0aXS-vHsoDMS22f_Ute3HJAnpjLyXnA7SfEqhGFy27jaGiSD2TBmFCCV0RpI5YEo-ZN1QugdXErpivb125Ub3vY8NQrVHKs61RsdWwDzvAcW5SH0R6N5UgopWtBqluzV5Lq9Ye/s420/V_guide5.png" data-original-width="1300" data-original-height="669" /></a></div></br>
Another free, easy to use online tool that works with Global25 coordinates is the Principal Component Analysis (PCA) runner <a href="https://vahaduo.github.io/g25views/#">HERE</a>. Below is a screen cap of me checking out one of the many PCA that it offers.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN8tph3t5ZarRmQqnVILNPBTFxcmdj4eRN2NENtzVQb7-AVX0Ojc0wkgDpYrAIP8AQlGfQAo-QSJy5VHad6FtHuRcaTgfsa2VSbg0BFEjlXrm5V80MMJFzceWx9hsMHYs6pB0fLhT29Ej5/s1600/Vahaduo_g25views.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN8tph3t5ZarRmQqnVILNPBTFxcmdj4eRN2NENtzVQb7-AVX0Ojc0wkgDpYrAIP8AQlGfQAo-QSJy5VHad6FtHuRcaTgfsa2VSbg0BFEjlXrm5V80MMJFzceWx9hsMHYs6pB0fLhT29Ej5/s480/Vahaduo_g25views.png" data-original-width="1300" data-original-height="668" /></a></div></br>
See also...</br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
Avalon vs Valhalla revisited</br>Published 2019-12-14</br></br>Pictured below is a new version of my Celtic vs Germanic genetic map. It's based on the same Principal Component Analysis (PCA) as the original (which can be seen <a href="https://eurogenes.blogspot.com/2018/09/celtic-vs-germanic-europe.html">here</a>), but more focused on Northwestern Europe and produced with a different program.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPkPIpNn1KMxB6ipTIQ0CoaF3p_bXsNAZXPYBULwMf4baMIct1SfE665-ISNEG1bnPuLSUSNFBSSl4gt1a2HnQjotsylX8FYq06C_Mhx-h87fqyvVwhwxymvvsUT_rTI697wxk7q7oLx-R/s1600/Avalon_vs_Valhalla_screen_cap1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPkPIpNn1KMxB6ipTIQ0CoaF3p_bXsNAZXPYBULwMf4baMIct1SfE665-ISNEG1bnPuLSUSNFBSSl4gt1a2HnQjotsylX8FYq06C_Mhx-h87fqyvVwhwxymvvsUT_rTI697wxk7q7oLx-R/s480/Avalon_vs_Valhalla_screen_cap1.png" data-original-width="1299" data-original-height="669" /></a></div></br>
To see the interactive online version, navigate to <a href="https://vahaduo.github.io/custompca/">Vahaduo Custom PCA</a> and copy paste the text from <a href="https://drive.google.com/open?id=15jg4asZD-cZK2cG_J_hN_vtQoZ7wM0R3">here</a> into the empty space under the PCA DATA tab. Then press the PLOT PCA button under the PCA PLOT tab. For more guidance, refer to the screen caps <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNbwfVY0SAbLrRTT4Y8yFug2x2VSGh5yrsgdaPJy8EBIY-30JGmbyujAi5GUImCdvFEI0eH5KHYE0fIO_b-L67VSb-4M8p7U5sXuaOywZH9evVpPKnS14BaruC_vSy5Vc1g868ZwbVZ6Ee/s1600/Avalon_vs_Valhalla_screen_cap3.png">here</a> and <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGjneiLKKbTzWa1tJFyzuR6R4Irz6ekXhVQTsJ9oon1kYXDfWcFq-uBrBR7OlnVtA-FsSNZJGJvhZ7mDCCod8QLaM077BN2SsZyRpvICdxh0c0q06qg4qPtgdESWhPkAKs1CwL7I-Ap8pr/s1600/Avalon_vs_Valhalla_screen_cap4.png">here</a>.</br></br>
To include a wider range of populations in the key, just edit the data accordingly. For instance, to break up the ancient grouping into more specific populations, delete the <i>Ancient:</i> prefix in all of the relevant rows. This is what you should see:</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0EnuYq6aj3n4BVIRM5d84BQeSiW6K8eVwphJyTWZk2XhOo1XmUo9ZVFPEb51LyatM0CDT9n6WpcJ4SGLAuM9_AZTQrlGn4RgYSSGz5goAE6UilDT8FuoOZRo_rgHRxVh-hiCLhLkIpnVa/s1600/Avalon_vs_Valhalla_screen_cap2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0EnuYq6aj3n4BVIRM5d84BQeSiW6K8eVwphJyTWZk2XhOo1XmUo9ZVFPEb51LyatM0CDT9n6WpcJ4SGLAuM9_AZTQrlGn4RgYSSGz5goAE6UilDT8FuoOZRo_rgHRxVh-hiCLhLkIpnVa/s480/Avalon_vs_Valhalla_screen_cap2.png" data-original-width="1299" data-original-height="669" /></a></div></br>
Conversely, you can leave the ancient sample set intact and instead reorder the present-day linguistic groupings into, say, geographic groupings. To achieve this, just delete all of the linguistic prefixes, such as <i>Celtic:</i>, <i>Germanic:</i>, and so on. You should end up with a datasheet like <a href="https://drive.google.com/open?id=1HijZbNXeJjkMVVBSFDHoPqaK9eVjl8s9">this</a> and a plot like <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3iPMQ8SAI8hEeaRDhyfzhEmLUENbU90KelM5jXEKchXhmloNK9M6JigMNHrucMRIXFqD2Xvsmr5xIoSZMwDcZOrHD_2RM_2P_20NEU2e3TsUq5VHQAwEYykRFr7bHlI10ueyHCRLL5PF7/s1600/Avalon_vs_Valhalla_screen_cap5.png">this</a>.</br></br>
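If you'd rather not edit the rows by hand, the prefix deletion described above can be scripted. Here's a minimal Python sketch, assuming that each row of the PCA data is a line of text that begins with its label (for example <i>Ancient:</i> followed by the rest of the label and the comma-separated coordinates). The function name and the sample rows are made up for illustration, so check them against the real data before relying on this.</br></br>

```python
def strip_prefix(pca_text, prefix):
    """Remove `prefix` from the start of every row that begins with it.

    Rows that don't start with the prefix are left untouched.
    """
    cleaned = []
    for row in pca_text.splitlines():
        if row.startswith(prefix):
            row = row[len(prefix):]
        cleaned.append(row)
    return "\n".join(cleaned)


# Hypothetical rows, not real coordinates.
example = "Ancient:Pop_A:ID1,0.01,0.02\nCeltic:Pop_B:ID2,0.03,0.04"
print(strip_prefix(example, "Ancient:"))
```

Calling the function once per prefix (<i>Celtic:</i>, <i>Germanic:</i>, and so on) gives the same result as deleting those prefixes manually.</br></br>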
<b>Of course, you can design your own plot by using any combination of the ancient and present-day individuals and populations that I've already run in this PCA. Their coordinates are listed <a href="https://drive.google.com/open?id=1k5XB-kQUul8yFC5reGLQw2-7HOdxrBij">here</a>. Indeed, if you're in possession of your own Celtic vs Germanic PCA coordinates, you can add yourself to the plot. And if you're not, see <a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">here</a>.</b></br></br>
It's also possible to re-process PCA data via the SOURCE tab. But I don't recommend doing this with the Celtic vs Germanic data, which are derived from a fine scale analysis and don't pack much variation. On the other hand, Global25 data are ideal for such re-processing. I made the plots below from subsets of Global25 coordinates available in a zip file <a href="https://drive.google.com/open?id=1I8HRIf-F61Ytk-Mhz-vmGnhvOk21fF6e">here</a>. To see how, refer to the screen caps <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjduPKtfUZLyea2YuptQHFs9FOhcW-Y8fcrLpDfAHmymcE5sEenQQ7EuTFG34KzT2r3jEjGLlxu7E9r87C9hj9cWScAfmjZChtMTlIzP3_tP8m36wmyc5VUQdppiIAxOpHGXNB6_0ZRDOXc/s1600/V_Custom_PCA_screen_cap4.png">here</a> and <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDOdjSTSELNoV6MDPEcS0uCLsQSWqSnF_vcgAk5qW-epFf2zv8Edqd_bqIPb_nTTtHblkJVGpnWOqaC4cBfAf0ii6DAS6AND2n4Z2h4JM5iaweBP3mRmVCdSeCwRuD9uO7S3pSYVZkfDpA/s1600/V_Custom_PCA_screen_cap5.png">here</a>.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMWaGnLpLSJbuUPLLxJZwE7XaPaL2HqXMNvnwvx-5JnSlY_aWZBHWZIDGgqmCIq_z2Jjn8gQQnNZ-BNqr_LdE-Ng3sWFweHox54Q12ETlumA6Q8PNRgAwy1ZokEJnfRq2NxbLxbOuDV2rm/s1600/V_Custom_PCA_screen_cap1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMWaGnLpLSJbuUPLLxJZwE7XaPaL2HqXMNvnwvx-5JnSlY_aWZBHWZIDGgqmCIq_z2Jjn8gQQnNZ-BNqr_LdE-Ng3sWFweHox54Q12ETlumA6Q8PNRgAwy1ZokEJnfRq2NxbLxbOuDV2rm/s480/V_Custom_PCA_screen_cap1.png" data-original-width="1299" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvx7qBvT-mPvWDckvlyJgsooRqswnt_JBhuBhJUr5MMCUX6zEen-8Qtc8fqeGHpJagVISp1qWbDzodds-qj0KPYe0fgBwwdO9P69uM1eGFY5FuhEP0SMIwRyw7t3qNl_lFn4ug5ARcjqqI/s1600/V_Custom_PCA_screen_cap2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvx7qBvT-mPvWDckvlyJgsooRqswnt_JBhuBhJUr5MMCUX6zEen-8Qtc8fqeGHpJagVISp1qWbDzodds-qj0KPYe0fgBwwdO9P69uM1eGFY5FuhEP0SMIwRyw7t3qNl_lFn4ug5ARcjqqI/s480/V_Custom_PCA_screen_cap2.png" data-original-width="1299" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwaBDVP7kz2cKoK609FkbVXV5IBj1hdVMkIHlSE56h1WCjJHS_MIxsIlhLgPEz9O9rDdwLPtW06URyyHAAO-V74dYwItFmifHtfKXNjZ5gNwBE-ubUd5vGlWLjZAo9mmODMGbskVma8y8Q/s1600/V_Custom_PCA_screen_cap3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwaBDVP7kz2cKoK609FkbVXV5IBj1hdVMkIHlSE56h1WCjJHS_MIxsIlhLgPEz9O9rDdwLPtW06URyyHAAO-V74dYwItFmifHtfKXNjZ5gNwBE-ubUd5vGlWLjZAo9mmODMGbskVma8y8Q/s480/V_Custom_PCA_screen_cap3.png" data-original-width="1299" data-original-height="669" /></a></div></br>
See also...</br></br>
<a href="https://bga101.blogspot.com/2019/11/modeling-your-ancestry-has-never-been_30.html">Modeling your ancestry has never been easier</a></br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a></br></br>
Getting the most out of the Global25</br>Published 2019-07-12, updated 2021-07-29</br></br>The first thing you need to know about the <a href="https://bga101.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Global25</a> is that I update the relevant datasheets regularly, usually every few weeks, but they're always at these links:</br></br>
<blockquote><a href="https://drive.google.com/open?id=1UrhcfNMLW0oMXIbHGUE60v2taCM7PFw1">Global25 datasheet ancient scaled</a></br></br>
<a href="https://drive.google.com/open?id=1F2rKEVtu8nWSm7qFhxPU6UESQNsmA-sl">Global25 pop averages ancient scaled</a></br></br>
<a href="https://drive.google.com/open?id=1YKkEOtyV5SISvmY_FyS4YSLXCxxYt5_W">Global25 datasheet ancient</a></br></br>
<a href="https://drive.google.com/open?id=1f0imQyVNZ9RPESNAYIeIkA8fx4wAVNYo">Global25 pop averages ancient</a></br></br>
...</br></br>
<a href="https://drive.google.com/open?id=1HYrDwxEXv82DvDLoq736pS5ZTGJA4dn5">Global25 datasheet modern scaled</a></br></br>
<a href="https://drive.google.com/open?id=1wZr-UOve0KUKo_Qbgeo27m-CQncZWb8y">Global25 pop averages modern scaled</a></br></br>
<a href="https://drive.google.com/open?id=18GcEVEl3GI-ByviD-TgQQjvEaaTbNTr2">Global25 datasheet modern</a></br></br>
<a href="https://drive.google.com/open?id=1y49hyvviJpHj9esVqyeiFm32DhnPlfRQ">Global25 pop averages modern</a></blockquote></br>
Each sample has a population code and an individual code. The population codes represent the countries, ethnic groups and/or archeological affinities of the samples, and I often modify these codes to suit my needs. The individual codes, on the other hand, are unique for most of the samples, and I usually don't change them.</br></br>
So if you'd like to know more details about the samples try searching for their individual codes via a decent online search engine. Basic information about many of the samples is also available in the "anno" files <a href="https://reich.hms.harvard.edu/downloadable-genotypes-worlds-published-ancient-dna-data">here</a>.</br></br>
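To make the two-part sample codes more concrete, here's a minimal Python sketch that splits each label into its population and individual codes and then averages the coordinates per population. It assumes that a datasheet is comma-separated text in which the first field looks like Population:Individual, and that any header row lacks a colon in its first field. These layout details, the function name and the sample rows are assumptions for illustration. The plain mean used here may not match exactly how the published population averages were produced.</br></br>

```python
import csv
import io
from collections import defaultdict


def population_averages(datasheet_text):
    """Average the coordinates of all individuals sharing a population code."""
    sums = {}
    counts = defaultdict(int)
    for row in csv.reader(io.StringIO(datasheet_text)):
        if not row or ":" not in row[0]:
            continue  # skip blank lines and a header row
        # Split only on the first colon: population code, then individual code.
        population, _individual = row[0].split(":", 1)
        coords = [float(value) for value in row[1:]]
        if population not in sums:
            sums[population] = [0.0] * len(coords)
        sums[population] = [s + c for s, c in zip(sums[population], coords)]
        counts[population] += 1
    return {p: [s / counts[p] for s in sums[p]] for p in sums}


# Hypothetical rows with two coordinates each, not real Global25 data.
sample_sheet = ",PC1,PC2\nPop_A:ID1,1.0,2.0\nPop_A:ID2,3.0,4.0\nPop_B:ID3,5.0,6.0"
print(population_averages(sample_sheet))
```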
The main purpose of the Global25 is to provide data for mixture modeling. In other words, for estimating ancestry proportions, both ancient and modern (see <a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">here</a>). This can be done on your computer with the R program and the nMonte R script, or online with a couple of different tools, which I discuss below.</br></br>
If you don't have R installed on your computer, you can get it <a href="https://cloud.r-project.org/">here</a>, while nMonte is available <a href="https://www.dropbox.com/sh/1iaggxyc2alafow/AACIjLtnkuaNNsJ5oKME_3XHa?dl=0">here</a>. For this tutorial please download nMonte and nMonte3, and store them in your main working folder (usually My Documents).</br></br>
Once you have R set up, make sure its working directory is the same place where you stored nMonte. You can check this in R by clicking on "File" and then "Change dir". Additionally, you'll need two nMonte input files in the working directory titled "data" and "target". Examples of these files are available <a href="https://drive.google.com/open?id=1WZ-r84g4r7rPFtZWjPVVuYObUnJeb779">here</a>. We'll be using them to test the ancient ancestry proportions of a sample set from present-day England.</br></br>
Before you can begin the analysis you need to first call the nMonte script by typing or copy pasting <i>source('nMonte.R')</i> into the R console window, and then hitting "enter" on your keyboard. This is what you should see in the R console window afterwards.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY4XaB1vdw7A5kJOBORo4ruV0r5o0hrIBCHHOYeXrUbFwc7P1_stVkAnB9sqonJ4yhNljVIum0PcYKho-2pjZktiXVF1fV46EfWmpGceim9Hg28RrrnRBuzta8UMwvE0_d4-IJCpj1KiA/s1600/G25_guide1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY4XaB1vdw7A5kJOBORo4ruV0r5o0hrIBCHHOYeXrUbFwc7P1_stVkAnB9sqonJ4yhNljVIum0PcYKho-2pjZktiXVF1fV46EfWmpGceim9Hg28RrrnRBuzta8UMwvE0_d4-IJCpj1KiA/s430/G25_guide1.png" data-original-width="681" data-original-height="472" /></a></div></br>
To start the mixture modeling process, type or copy paste <i>getMonte('data.txt', 'target.txt')</i> into the R console window, hit "enter", and wait for the results. After a short time, probably less than a minute or two, you should see this output.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwEzGUJv2bCEN6w6AqeTwCItLpYx3mNUht0c0t_JKP9t68me-vrnU4ZHJWfU7vzECBQcKoA3PvAXS09o76S6r7IiV0V-k5ribAOD0dep3_-KSVHtEL1Silm-t_u0m2z8n8HqhihEzi3Qk/s1600/G25_guide2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwEzGUJv2bCEN6w6AqeTwCItLpYx3mNUht0c0t_JKP9t68me-vrnU4ZHJWfU7vzECBQcKoA3PvAXS09o76S6r7IiV0V-k5ribAOD0dep3_-KSVHtEL1Silm-t_u0m2z8n8HqhihEzi3Qk/s430/G25_guide2.png" data-original-width="681" data-original-height="472" /></a></div></br>
The data and target files contain population averages. And, as you can see, the results that these population averages have produced are in line with what one would expect from such a model focusing on the genetic shifts in Northern Europe during the Late Neolithic. Very similar ancient ancestry proportions have been reported for the English and other Northern Europeans recently in scientific literature.</br></br>
However, when focusing on exceptionally fine-scale genetic variation that isn't reflected too well in the Global25 population averages, a more effective strategy <i>might be</i> to use multiple individuals from each reference population and let nMonte3 aggregate and average the inferred ancestry proportions.</br></br>
This is often the case when attempting to model ancestry proportions for more recent periods, such as the Middle Ages. So let's try this with the English sample set using a modified data file, which is available <a href="https://drive.google.com/open?id=1UXX9tbNfAebphVPG2jCz-c3WyyPGs7Uw">here</a>.</br></br>
Replace the old data file with the new one in your working directory, and, like before, copy paste into the R console window the following two commands, hitting "enter" after each one: <i>source('nMonte3.R')</i> and <i>getMonte('data.txt', 'target.txt')</i>. This is what you should eventually see.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCaE-Ktk5enpFg31aM5zhycVA8NFBuCkS_o1spEvaOMKJRjU206p07B_v3sySfkCmE-iw4xQDgJGlLUG_r9wopsBHOOYd_6LjAQhrwsxOc3CTJqTRxCrsj_n5Z2Z43u0IjtOXUNTh1G-I/s1600/G25_guide3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCaE-Ktk5enpFg31aM5zhycVA8NFBuCkS_o1spEvaOMKJRjU206p07B_v3sySfkCmE-iw4xQDgJGlLUG_r9wopsBHOOYd_6LjAQhrwsxOc3CTJqTRxCrsj_n5Z2Z43u0IjtOXUNTh1G-I/s430/G25_guide3.png" data-original-width="681" data-original-height="472" /></a></div></br>
It's difficult to say how accurate these estimates are. But they look more or less correct considering the limited and less than ideal reference samples. For instance, the individuals labeled SWE_Viking_Age_Sigtuna are supposed to be stand-ins for Danish and Norwegian Vikings, but they're a relatively heterogeneous group from Sweden, possibly with some British or Irish ancestry, so they might be skewing the results.</br></br>
However, I'll be adding many more ancient samples to the Global25 datasheets as they become available, including lots of new Vikings, which should greatly improve the accuracy of these sorts of fine-scale mixture models.</br></br>
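For readers curious about what mixture modeling does with the coordinates, here's a rough Python sketch of the general idea: find the source proportions whose weighted average lies closest to the target. This is not nMonte's code and makes no claim to reproduce its algorithm or output. It's a simple greedy search over proportions in steps of 1%, and the function names, the step size and the toy coordinates are all assumptions for illustration.</br></br>

```python
import math


def mix(sources, weights):
    """Weighted average of the source coordinates."""
    dims = len(next(iter(sources.values())))
    return [sum(weights[n] * sources[n][d] for n in sources) for d in range(dims)]


def distance(a, b):
    """Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_mixture(sources, target, slots=100):
    """Greedy search: move one slot (1/slots of ancestry) between sources
    whenever that brings the mixture closer to the target."""
    names = list(sources)
    counts = {n: slots // len(names) for n in names}
    counts[names[0]] += slots - sum(counts.values())  # hand out the remainder

    def score(c):
        return distance(mix(sources, {n: c[n] / slots for n in names}), target)

    best = score(counts)
    improved = True
    while improved:
        improved = False
        for giver in names:
            for taker in names:
                if giver == taker or counts[giver] == 0:
                    continue
                counts[giver] -= 1
                counts[taker] += 1
                trial = score(counts)
                if trial < best:
                    best = trial
                    improved = True
                else:
                    # Undo the move if it didn't help.
                    counts[giver] += 1
                    counts[taker] -= 1
    return {n: counts[n] / slots for n in names}, best


# Toy two-dimensional coordinates, not real Global25 data.
toy_sources = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}
toy_weights, toy_distance = fit_mixture(toy_sources, [0.3, 0.0])
print(toy_weights, toy_distance)
```

The returned distance plays the same role as a fit statistic: the smaller it is, the closer the modeled mixture sits to the target.</br></br>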
An exceedingly simple, yet feature-packed, online tool ideal for modeling ancestry with Global25 coordinates is the VahaduoJS. It's freely available <a href="https://vahaduo.github.io/vahaduo/">HERE</a>, and it works offline too after downloading the web page. Just copy paste the coordinates of your choice under the "source" and "target" tabs, and then mess around with the buttons to see what happens. The screen caps below show me doing just that.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW-dyAjTK2Ac4_fFWeUXV5za8H2eUAt8ORr4-46m8iu-7TXR3p46xTOTaeoMlIcV_TYfLOJKO-6v7AZCGGt5jEPYZymub0B44LB0tIOiEtyULI3Jfblv8tTlqO3ua8DfXFJAwvNaXmdJ3T/s1600/V_guide1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW-dyAjTK2Ac4_fFWeUXV5za8H2eUAt8ORr4-46m8iu-7TXR3p46xTOTaeoMlIcV_TYfLOJKO-6v7AZCGGt5jEPYZymub0B44LB0tIOiEtyULI3Jfblv8tTlqO3ua8DfXFJAwvNaXmdJ3T/s420/V_guide1.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXlMLZEv6z4l8YQOSZ1ZYEC7uML8T01yOTiPh26ZlOEi9UQltaLzU-SL8gfwLZYyLhwaLB7vUdfwRkJ5jylI2lfrS5VKbLorqAyILm_35c-MAoV5JrnG5e9OME3DmkAM1Qu-_dw6n9UQDm/s1600/V_guide2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXlMLZEv6z4l8YQOSZ1ZYEC7uML8T01yOTiPh26ZlOEi9UQltaLzU-SL8gfwLZYyLhwaLB7vUdfwRkJ5jylI2lfrS5VKbLorqAyILm_35c-MAoV5JrnG5e9OME3DmkAM1Qu-_dw6n9UQDm/s420/V_guide2.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimEEnB7GRuTHRMMuStuLPPv0nYOoG7qkGeT6PmwZ9HcfhyphenhyphenHfa9bOXP5Np0_JhAxMS7DJoc9kHbeh1XtLmiLkx1ziTi40S6LrXp8HfRsmIT_Uf3zztl_E8D_oyHY-zkyaRGDO8qWBTcQ4ih/s1600/V_guide3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimEEnB7GRuTHRMMuStuLPPv0nYOoG7qkGeT6PmwZ9HcfhyphenhyphenHfa9bOXP5Np0_JhAxMS7DJoc9kHbeh1XtLmiLkx1ziTi40S6LrXp8HfRsmIT_Uf3zztl_E8D_oyHY-zkyaRGDO8qWBTcQ4ih/s420/V_guide3.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYBIz82u2Cv-C6oKQDk-sS4N3rgB3VHmViemVS_0SDMVAw4lylYYEz-2ZseJvl0cAC01RF-_urqdAUv05u3lGaxRxn_bTBge0T02FUjcrK_jlm1ENo99awzc1QFCd7RzseQR88-XFou5AC/s1600/V_guide4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYBIz82u2Cv-C6oKQDk-sS4N3rgB3VHmViemVS_0SDMVAw4lylYYEz-2ZseJvl0cAC01RF-_urqdAUv05u3lGaxRxn_bTBge0T02FUjcrK_jlm1ENo99awzc1QFCd7RzseQR88-XFou5AC/s420/V_guide4.png" data-original-width="1300" data-original-height="669" /></a></div></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZTBy1pU0aXS-vHsoDMS22f_Ute3HJAnpjLyXnA7SfEqhGFy27jaGiSD2TBmFCCV0RpI5YEo-ZN1QugdXErpivb125Ub3vY8NQrVHKs61RsdWwDzvAcW5SH0R6N5UgopWtBqluzV5Lq9Ye/s1600/V_guide5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZTBy1pU0aXS-vHsoDMS22f_Ute3HJAnpjLyXnA7SfEqhGFy27jaGiSD2TBmFCCV0RpI5YEo-ZN1QugdXErpivb125Ub3vY8NQrVHKs61RsdWwDzvAcW5SH0R6N5UgopWtBqluzV5Lq9Ye/s420/V_guide5.png" data-original-width="1300" data-original-height="669" /></a></div></br>
However, it's important to note that the Global25 is a Principal Component Analysis (PCA), so it makes good sense to also use it for producing PCA graphs. To do this, just plot any combination of two or three of its Principal Components (PCs) to create 2D or 3D graphs, respectively. This can be done with a wide variety of programs, including PAST, which is freely available <a href="https://folk.uio.no/ohammer/past/index.html">here</a>.</br></br>
To produce a 2D graph, open a Global25 datasheet in PAST, choose comma as the separator, highlight any two columns of data, click on the "Plot" tab and, from the drop-down list, pick "XY graph". Below is a series of graphs that I created in exactly this way. I also color-coded the samples according to their geographic origins. This was done by ticking the "Row attributes" tab.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMqCYJGM8dhEjyDW8Z9BYuVctgPa_HsclUOI8zDpF38ISw816zyJrIxTPEdCasmMfGhbMufUwX0CAWKvl-6_70FRMR8f8JDo7OMjee9dI3wX6gjyN4jGUk9yRSjHA_QEZ-a-XBBH3Z_NQ/s1600/G25_PCA.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMqCYJGM8dhEjyDW8Z9BYuVctgPa_HsclUOI8zDpF38ISw816zyJrIxTPEdCasmMfGhbMufUwX0CAWKvl-6_70FRMR8f8JDo7OMjee9dI3wX6gjyN4jGUk9yRSjHA_QEZ-a-XBBH3Z_NQ/s460/G25_PCA.png" data-original-width="1252" data-original-height="1558" /></a></div></br>
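For anyone who would rather script the column-picking step than click through PAST, here is a minimal Python sketch. It assumes a comma-separated datasheet in which the first field is a sample label and the remaining fields are the PCs in order. The function name <code>pc_pairs</code> and the three-column sample data are hypothetical and only there for illustration.</br></br>

```python
import csv
import io


def pc_pairs(datasheet_text, pc_x=1, pc_y=2):
    """Return (label, x, y) tuples for two chosen PCs (1-based indices)."""
    points = []
    for row in csv.reader(io.StringIO(datasheet_text)):
        if not row:
            continue  # skip blank lines
        label, values = row[0], row[1:]
        try:
            x = float(values[pc_x - 1])
            y = float(values[pc_y - 1])
        except (ValueError, IndexError):
            continue  # header row or malformed line
        points.append((label, x, y))
    return points


# Hypothetical three-component sample, NOT real Global25 coordinates.
SAMPLE = (
    ",PC1,PC2,PC3\n"
    "Pop_A:a1,0.10,0.20,0.30\n"
    "Pop_B:b1,-0.10,0.05,0.00\n"
)

for label, x, y in pc_pairs(SAMPLE, pc_x=1, pc_y=3):
    print(label, x, y)
```

The resulting (x, y) pairs can be handed to any plotting program. Picking three PCs instead of two works the same way for a 3D graph.</br></br>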
PAST can also be used to run PCA on subsets of the Global25 scaled data to produce remarkably accurate plots of fine-scale population structure. For instance, here's a plot based on present-day populations from north of the Alps, Balkans and Pyrenees.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfprzi6fTubvtOEkKJXsRlpOjTw_uwBbeAP9I8yB6rY0q2dpf4oAZ94oTxZNMfNOG8Q6LGRu_EtObGTAhghxjYA-ufIpOj-WUsdLFkbH0g0I-JkwC-L38he1X-LH3U7T0IL-gKWV9DrSO8/s1600/G25_North_Euro_PCA.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfprzi6fTubvtOEkKJXsRlpOjTw_uwBbeAP9I8yB6rY0q2dpf4oAZ94oTxZNMfNOG8Q6LGRu_EtObGTAhghxjYA-ufIpOj-WUsdLFkbH0g0I-JkwC-L38he1X-LH3U7T0IL-gKWV9DrSO8/s480/G25_North_Euro_PCA.png" data-original-width="1081" data-original-height="560" /></a></div></br>
To try this, create a new text file with your choice of populations from the Global25 scaled datasheet, open it with PAST and choose Multivariate > Ordination > Principal Components Analysis. I've already put together several datasheets limited to European, Northern European, West Eurasian and South Asian populations. They're available at the links below, along with more details on how to run them with PAST.</br></br>
<blockquote><a href="https://bga101.blogspot.com/2018/08/global25-workshop-1-that-classic-west.html">Global25 workshop 1: that classic West Eurasian plot</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-2-intra-european.html">Global25 workshop 2: intra-European variation</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://eurogenes.blogspot.com/2018/08/the-south-asian-cline-that-no-longer.html">The South Asian cline that no longer exists</a></blockquote></br>
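Building a custom subset by hand gets tedious, so here is a rough Python sketch of the filtering step. It assumes that each data row starts with a label of the form <code>Population:SampleID</code> and that a header row, if present, starts with an empty first field. Both are assumptions about the file layout, and <code>subset_datasheet</code> is a hypothetical helper, not part of PAST or any Global25 tool.</br></br>

```python
def subset_datasheet(lines, wanted_populations):
    """Keep the header (if any) and the rows whose population is wanted."""
    wanted = set(wanted_populations)
    kept = []
    for line in lines:
        label = line.split(",", 1)[0]
        if label == "":
            kept.append(line)  # assumed header row: empty first field
            continue
        population = label.split(":", 1)[0]
        if population in wanted:
            kept.append(line)
    return kept


# Hypothetical rows with made-up values, for illustration only.
rows = [
    ",PC1,PC2",
    "Pop_A:a1,0.1,0.2",
    "Pop_B:b1,0.3,0.4",
    "Pop_C:c1,0.5,0.6",
]

for line in subset_datasheet(rows, ["Pop_A", "Pop_C"]):
    print(line)
```

Write the kept lines to a new text file and open that file in PAST as described above.</br></br>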
Another free, easy-to-use online tool that works with Global25 coordinates is Vahaduo Global25 Views [<a href="https://vahaduo.github.io/g25views/#">LINK</a>]. Below is a screen cap of me checking out one of the many PCA plots that it offers.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN8tph3t5ZarRmQqnVILNPBTFxcmdj4eRN2NENtzVQb7-AVX0Ojc0wkgDpYrAIP8AQlGfQAo-QSJy5VHad6FtHuRcaTgfsa2VSbg0BFEjlXrm5V80MMJFzceWx9hsMHYs6pB0fLhT29Ej5/s1600/Vahaduo_g25views.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN8tph3t5ZarRmQqnVILNPBTFxcmdj4eRN2NENtzVQb7-AVX0Ojc0wkgDpYrAIP8AQlGfQAo-QSJy5VHad6FtHuRcaTgfsa2VSbg0BFEjlXrm5V80MMJFzceWx9hsMHYs6pB0fLhT29Ej5/s480/Vahaduo_g25views.png" data-original-width="1300" data-original-height="668" /></a></div></br>
And if you're fond of tree-like structures as a means to describe fine-scale genetic variation, please see this blog post...</br></br>
<blockquote><a href="https://eurogenes.blogspot.com/2019/07/global25-workshop-4-neighbor-joining.html">Global25 workshop 4: a neighbour joining tree</a></blockquote></br>
See also...</br></br>
<a href="https://bga101.blogspot.com/2020/08/new-global25-interpretation-tools.html">New Global25 interpretation tools</a></br></br>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-72653253010211089472019-06-09T18:34:00.001-07:002019-10-23T03:55:12.984-07:00Global25 nMonte runner<br>Those of you who are having trouble making use of your Global25 coordinates on your own computers, please be aware that there's an online tool that might be of help. It's called the <a href="http://185.144.156.77:3000/">Global25 nMonte runner</a> and it's very easy to use. For more info see <a href="https://anthrogenica.com/showthread.php?14849-Automated-Global25-nMonte-entirely-from-the-web">here</a>.<br><br>
<div class="separator" style="clear: both; text-align: center;"><a href="http://185.144.156.77:3000/" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhG5l4L4C4iQre-g92kmRAs4QUY6QuP3G7RsyjRKCGLTkOh6ZHBCL2d4dH-jDbguoob_Q02TFJpPKxLuMlNwabg9tlwe8_jcW1lfjuijx1PApjZJuhQapii0y743Xu9kz_tGy-S3Tex3R0/s450/Web_runner_screen_cap.jpg" data-original-width="750" data-original-height="594" /></a></div><br>
See also...<br><br>
<a href="https://bga101.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Genetic ancestry online store (to be updated regularly)</a><br><br>
<a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a><br><br>
<a href="https://bga101.blogspot.com/2018/03/if-youre-using-my-tools-to-find-jewish.html">If you're using my tools to find Jewish ancestry please read this</a><br><br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-28468013777252131362018-08-25T23:34:00.002-07:002019-10-23T03:52:07.283-07:00Global25 workshop 3: genes vs geography in Northern Europe</br>To produce the intra-North European Principal Components Analysis (PCA) plot below, download this <a href="https://drive.google.com/file/d/1HvtcnHd0i6BaocSGmgSAOU78NG7gIWZz/view?usp=sharing">datasheet</a>, plug it into the PAST program, which is freely available <a href="https://folk.uio.no/ohammer/past/index.html">here</a>, then select all of the columns by clicking on the empty tab above the labels, and choose Multivariate > Ordination > Principal Components or Discriminant Analysis. This is what you should end up with...</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfprzi6fTubvtOEkKJXsRlpOjTw_uwBbeAP9I8yB6rY0q2dpf4oAZ94oTxZNMfNOG8Q6LGRu_EtObGTAhghxjYA-ufIpOj-WUsdLFkbH0g0I-JkwC-L38he1X-LH3U7T0IL-gKWV9DrSO8/s1600/G25_North_Euro_PCA.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfprzi6fTubvtOEkKJXsRlpOjTw_uwBbeAP9I8yB6rY0q2dpf4oAZ94oTxZNMfNOG8Q6LGRu_EtObGTAhghxjYA-ufIpOj-WUsdLFkbH0g0I-JkwC-L38he1X-LH3U7T0IL-gKWV9DrSO8/s480/G25_North_Euro_PCA.png" data-original-width="1081" data-original-height="560" /></a></div>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_tEjUxhmte-8NvmnD-pqnoeRfrGTyLIb3zq4Az-VW5xEdW6tKVwDU8mC7WRnP9LRXD3dqIr5yXlBbZzglrZdzJ4h4ln4zOLmPbvF3G3fA02gzkR683frtvNsJcO0ApkF15a2NMX1g_RnU/s1600/G25_North_Euro_LDA.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_tEjUxhmte-8NvmnD-pqnoeRfrGTyLIb3zq4Az-VW5xEdW6tKVwDU8mC7WRnP9LRXD3dqIr5yXlBbZzglrZdzJ4h4ln4zOLmPbvF3G3fA02gzkR683frtvNsJcO0ApkF15a2NMX1g_RnU/s400/G25_North_Euro_LDA.png" data-original-width="663" data-original-height="541" /></a></div></br>
I'd say that the result more or less resembles a geographic map of Northern Europe. Of course, if you have your own personal Global25 coordinates, you can add yourself to this plot to check whether your position matches your geographic origin.</br></br>
Please keep in mind, however, that the vast majority (>90%) of your ancestry must be from north of the Alps, Balkans and Pyrenees to obtain a sensible outcome. Also please ensure that all of the columns in the datasheet are filled out correctly, including the group column, otherwise your position on the plot will be skewed.</br></br>
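Since an incomplete row can skew the plot, it may help to check a row before pasting it into the datasheet. The sketch below is only an illustration. It assumes a comma-separated row laid out as sample label, group, then 25 coordinates, which may not match the exact layout of the datasheet linked above. <code>check_row</code> is a hypothetical helper.</br></br>

```python
def check_row(line, n_components=25):
    """Return a list of problems found in one comma-separated row."""
    fields = [f.strip() for f in line.split(",")]
    expected = n_components + 2  # label + group + coordinates (assumed layout)
    if len(fields) != expected:
        return [f"expected {expected} fields, found {len(fields)}"]
    problems = []
    if not fields[0]:
        problems.append("missing sample label")
    if not fields[1]:
        problems.append("missing group")
    for i, value in enumerate(fields[2:], start=1):
        try:
            float(value)
        except ValueError:
            problems.append(f"PC{i} is not a number: {value!r}")
    return problems


# Made-up placeholder coordinates, for illustration only.
coords = ",".join(["0.01"] * 25)
print(check_row("Me,My_group," + coords))  # no problems
print(check_row("Me,," + coords))          # group column left empty
```

An empty list means the row at least has the right shape. It says nothing about whether the coordinates themselves are correct.</br></br>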
See also...</br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-1-that-classic-west.html">Global25 workshop 1: that classic West Eurasian plot</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-2-intra-european.html">Global25 workshop 2: intra-European variation</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://eurogenes.blogspot.com/2019/07/global25-workshop-4-neighbor-joining.html">Global25 workshop 4: a neighbour joining tree</a></br></br>
<a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a></br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Genetic ancestry online store (to be updated regularly)</a><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-17303919307976375372018-08-12T17:00:00.003-07:002019-10-23T03:51:05.791-07:00Global25 workshop 2: intra-European variation</br>Even though the Global25 focuses on world-wide human genetic diversity, it can also reveal a lot of information about genetic substructures within continental regions.</br></br>
Several of the dimensions, for instance, reflect Balto-Slavic-specific genetic drift. I ensured that this would be the case by running a lot of Slavic groups in the analysis. A useful by-product of this strategy is that the Global25 is very good at exposing relatively recent intra-European genetic variation.</br></br>
To see this for yourself, download the datasheet below and plug it into the PAST program, which is freely available <a href="https://folk.uio.no/ohammer/past/index.html">here</a>. Then select all of the columns by clicking on the empty tab above the labels, and choose Multivariate > Ordination > Principal Components.</br></br>
<blockquote><a href="https://drive.google.com/file/d/16yMI69rg07nhBpZOcuxIljFIBzlT6eJ0/view?usp=sharing">G25_Europe_scaled.dat</a></br></blockquote></br>
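For readers curious about what the Principal Components step does under the hood, here is a small pure-Python sketch that finds the first principal component of a set of rows by power iteration on the covariance matrix. This is a textbook technique chosen for brevity. It is not a description of how PAST computes its PCA, and the toy data are made up.</br></br>

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first PC via power iteration."""
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Arbitrary non-zero start vector; repeatedly multiply and normalize.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current vector
        v = [x / norm for x in w]
    # Project every centered row onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data lying exactly on the line y = 2x.
direction, scores = first_principal_component([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
print(direction, scores)
```

Further components can be obtained by removing the first component from the data and repeating, but for real work a dedicated program such as PAST is the more sensible choice.</br></br>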
You should end up with the plot below. Note that to see the group labels and outlines, you need to tick the appropriate boxes in the panel to the right of the image. To improve the experience, it might also be useful to color-code different parts of Europe, and you can do that by choosing Edit > Row colors/symbols. Of course, if you have Global25 coordinates you can add yourself to the datasheet to see where you plot.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEnrjJ6PSXcrLze8fNETq3j_yw9THRcXJAWiaej4JlCHc5MZSKjccS7IKprad9kcagabos2gJS7ou3k13IiL19nLACo2hNDlw5c1Utv2WQYyBOkxqwoWbTiHW5QYZY9_aOCDIR1AxkBe2-/s1600/G25_Europe_1%25262.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEnrjJ6PSXcrLze8fNETq3j_yw9THRcXJAWiaej4JlCHc5MZSKjccS7IKprad9kcagabos2gJS7ou3k13IiL19nLACo2hNDlw5c1Utv2WQYyBOkxqwoWbTiHW5QYZY9_aOCDIR1AxkBe2-/s480/G25_Europe_1%25262.png" data-original-width="1366" data-original-height="728" /></a></div></br>
Components 1 and 2 pack the most information and, more or less, recapitulate the geographic structure of Europe. However, many details can only be seen by plotting the less significant components. For instance, a plot of components 1 and 3 almost perfectly separates Northeastern Europe into two distinct clusters made up of the speakers of Indo-European and Finno-Ugric languages.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIYOwJ_Whoq8IaK8zoGzz6urEl3mlN49XjUUSToK4RptKhmEhh0eOlGBEz2LV57I2yjoMtm4mp6q_quR5U9wvRZJhRKoz438W_DWEYjbC63EMiXxFRBWRDEAD_IpaYb1PytNXH6uoLvkvz/s1600/G25_Europe_1%25263.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIYOwJ_Whoq8IaK8zoGzz6urEl3mlN49XjUUSToK4RptKhmEhh0eOlGBEz2LV57I2yjoMtm4mp6q_quR5U9wvRZJhRKoz438W_DWEYjbC63EMiXxFRBWRDEAD_IpaYb1PytNXH6uoLvkvz/s480/G25_Europe_1%25263.png" data-original-width="1366" data-original-height="728" /></a></div></br>
This plot might also be useful for exploring potential Jewish ancestry, because Ashkenazi, Italian and Sephardi Jews appear to be relatively distinct in this space. Thus, people with significant European Jewish ancestry will "pull" towards the lower left corner of the plot. For example, someone who is half Ashkenazi and half German will probably land in the empty space between the Northwest Europeans and Jews.</br></br>
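The "pull" can be pictured as simple linear mixing between two cluster centers. The sketch below assumes that a mixed individual plots roughly at the ancestry-weighted average of the two parental clusters, which is only an approximation. The centroid values are hypothetical and are not read off the plot above.</br></br>

```python
def mix_position(centroid_a, centroid_b, fraction_a=0.5):
    """Ancestry-weighted average of two cluster centroids (approximation)."""
    return tuple(
        fraction_a * a + (1.0 - fraction_a) * b
        for a, b in zip(centroid_a, centroid_b)
    )


# Hypothetical 2D centroids, for illustration only.
cluster_one = (0.2, 0.4)
cluster_two = (-0.2, 0.0)

half_and_half = mix_position(cluster_one, cluster_two, fraction_a=0.5)
print(half_and_half)
```

With a 50/50 split the point falls midway between the two centroids. A smaller fraction moves it proportionally closer to the other cluster.</br></br>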
See also...</br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-1-that-classic-west.html">Global25 workshop 1: that classic West Eurasian plot</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://eurogenes.blogspot.com/2019/07/global25-workshop-4-neighbor-joining.html">Global25 workshop 4: a neighbour joining tree</a></br></br>
<a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a></br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Genetic ancestry online store (to be updated regularly)</a><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-15404402013651466562018-08-12T17:00:00.001-07:002019-10-23T03:50:21.072-07:00Global25 workshop 1: that classic West Eurasian plot</br>In this Global25 workshop I'm going to show you how to reproduce the classic plot of West Eurasian genetic diversity seen regularly in ancient DNA papers and at this blog (for instance, <a href="https://eurogenes.blogspot.com/2018/05/new-pca-featuring-botai-horse-tamers.html">here</a>). To do this you'll need the datasheet below, which I'll be updating regularly, and the PAST program, which is freely available <a href="https://folk.uio.no/ohammer/past/index.html">here</a>.</br></br>
<blockquote><a href="https://drive.google.com/file/d/1ydPL8HOTBd54P3Uu3idth1EJU6TpU4MW/view?usp=sharing">G25_West_Eurasia_scaled.dat</a></blockquote></br>
Download the datasheet, plug it into PAST, select all of the columns by clicking on the empty cell above the labels, and go to Multivariate > Ordination > Principal Components. Here's a screen cap of me doing it:</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjD8-m3wYGsEg-__JEHKHGKEnljExxfF9Jngyj1NRbyKHo25UGLaS8zE6nkXSRXFSqCrl-fjYurLGb-zOnEpQscHdunoh0XyOsJQyWYDi46oJPqkXO9TG2Yr0QL5Ssju_cXk5ZMEWa5PLJ-/s1600/G25_West_Eurasia_guide1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjD8-m3wYGsEg-__JEHKHGKEnljExxfF9Jngyj1NRbyKHo25UGLaS8zE6nkXSRXFSqCrl-fjYurLGb-zOnEpQscHdunoh0XyOsJQyWYDi46oJPqkXO9TG2Yr0QL5Ssju_cXk5ZMEWa5PLJ-/s430/G25_West_Eurasia_guide1.png" data-original-width="1366" data-original-height="685" /></a></div></br>
This is what you should end up with. Please note that I also ticked the "convex hulls" box to define the populations from the "group" column in the datasheet.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0hmqbLY_eHkS0RyU2rvqIH3Z3iEzTaZVoTmDy3VMtEIZfpJxHuMZ1hNH5E-LA1u9TTAPLEVG5HxFnOXOSp5DTW7lmVfYYKTGQMQ_IlSLMRM8vkY6T_xnW_72JwFf3hZoptbjInO2HlIHi/s1600/G25_West_Eurasia_guide2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0hmqbLY_eHkS0RyU2rvqIH3Z3iEzTaZVoTmDy3VMtEIZfpJxHuMZ1hNH5E-LA1u9TTAPLEVG5HxFnOXOSp5DTW7lmVfYYKTGQMQ_IlSLMRM8vkY6T_xnW_72JwFf3hZoptbjInO2HlIHi/s430/G25_West_Eurasia_guide2.png" data-original-width="1366" data-original-height="705" /></a></div></br>
Here I also ticked the "group labels" box. It's generally a useful feature, even though it makes a mess of the plot in this case due to the large number of populations.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgw7mphsdpofrgPp-h2gUa2m4clu1X6EThZbmKQ_XbB2OBb6UwMZHq-15XtLqm1vMvc_9WrLHZUxgVqF6kJtw52MSruZJYKGn_hyoZBCcFeV7POLYjljBtwjYIn5-wlSaen0LaYzQhGsTr3/s1600/G25_West_Eurasia_guide3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgw7mphsdpofrgPp-h2gUa2m4clu1X6EThZbmKQ_XbB2OBb6UwMZHq-15XtLqm1vMvc_9WrLHZUxgVqF6kJtw52MSruZJYKGn_hyoZBCcFeV7POLYjljBtwjYIn5-wlSaen0LaYzQhGsTr3/s430/G25_West_Eurasia_guide3.png" data-original-width="1366" data-original-height="705" /></a></div></br>
See also...</br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-2-intra-european.html">Global25 workshop 2: intra-European variation</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://eurogenes.blogspot.com/2019/07/global25-workshop-4-neighbor-joining.html">Global25 workshop 4: a neighbour joining tree</a></br></br>
<a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a></br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Genetic ancestry online store (to be updated regularly)</a><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-37399495194824313982018-03-19T03:48:00.000-07:002019-10-23T03:49:31.956-07:00If you're using my tools to find Jewish ancestry please read this<br/>It's come to my attention that many people are still using the <a href="https://bga101.blogspot.com/2012/09/eurogenes-ashkenazim-ancestry-test-files.html">Jtest</a> and taking the results very seriously. Indeed, perhaps too seriously.<br/><br/>
Also, some users are doing weird stuff with the Jtest output in an attempt to estimate their supposedly "true" Ashkenazi ancestry proportions, like multiplying their Ashkenazi coefficient by three, because Ashkenazi Jews "only" score around 30% Ashkenazi in this test. Ouch! Please don't do that!<br/><br/>
Let me reiterate that this test was only supposed to be a fun experiment. It was never meant to be the definitive online Ashkenazi ancestry test. And even as fun experiments with ADMIXTURE go, it's now horribly outdated, and probably useless for anyone with less than 15-20% Ashkenazi ancestry.<br/><br/>
So it might be time to move on. If you really want to confirm your Jewish ancestry, whether Ashkenazi, Sephardi or both, then you need to look at much more powerful and sophisticated options. One of these options is the <b>Global25</b> analysis (see <a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">HERE</a>), which can pick up minor Jewish ancestry of just a few per cent. But it's not free (USD $12), and it's a DIY test that requires a bit of time and effort to get the most out of it. Also, you'd need to send me your autosomal file so that I can estimate your Global25 coordinates. But I can help you get started and even quickly check if you have any hope at all of confirming Jewish ancestry.<br/><br/>
If, for whatever reason, you'd rather not take advantage of the Global25 offer, because, say, you don't want to share your data with me, then it might be an idea to join the Anthrogenica discussion board and ask the experienced members there about other options [<a href="https://anthrogenica.com/forum.php">LINK</a>].<br/><br/>
In any case, whatever you choose to do, please remember the following points, and feel free to share them with others who are still using the Jtest:<br/><br/>
<blockquote><b>- do not multiply your Jtest Ashkenazi score by 3 in an attempt to find your "true" Ashkenazi ancestry proportion, because this won't work for the vast majority of users<br/><br/>
- but do compare your Jtest Ashkenazi score to those of other people of the same or very similar ancestry to yours to get a rough idea whether you might have any Ashkenazi ancestry (the Jtest population averages will be useful for this, see <a href="https://docs.google.com/spreadsheets/d/1XgXrkqivuGCbYocBm_MMbdRd2tX1A68D3bl_wJN3pUM/edit?usp=sharing">here</a>)<br/><br/>
- if you're still not sure what your Jtest results mean, then just focus on your Jtest Oracle-4 output at GEDmatch, and if you don't see AJ at the top of the oracle list, then this is a strong signal that you don't have substantial Ashkenazi ancestry</b></blockquote><br/>
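To make the first two points concrete, here is a tiny Python sketch that contrasts the "multiply by three" shortcut with a comparison against the average for people of similar ancestry. All numbers are hypothetical and do not come from the Jtest population averages. The function names are made up for illustration.</br></br>

```python
def naive_times_three(score):
    """The shortcut warned against above: scale the raw score by 3."""
    return score * 3


def excess_over_average(score, population_average):
    """Difference between a score and the average for similar ancestry."""
    return score - population_average


# Hypothetical values, for illustration only.
my_score = 8.0             # made-up Jtest Ashkenazi coefficient (%)
similar_people_avg = 7.5   # made-up average for people of similar ancestry (%)

print(naive_times_three(my_score))                          # 24.0
print(excess_over_average(my_score, similar_people_avg))    # 0.5
```

In this made-up case the shortcut suggests 24%, while the score is barely above what people of similar ancestry typically get, so there's no real signal. The difference is only a rough guide, not an ancestry estimate.<br/><br/>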
See also...<br/><br/>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-1-that-classic-west.html">Global25 workshop 1: that classic West Eurasian plot</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-2-intra-european.html">Global25 workshop 2: intra-European variation</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://bga101.blogspot.com/2018/02/modeling-genetic-ancestry-with-davidski.html">Modeling genetic ancestry with Davidski: step by step</a></br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Genetic ancestry online store (to be updated regularly)</a><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-89368772355906409552018-02-15T18:01:00.002-08:002019-12-29T17:21:06.997-08:00Modeling genetic ancestry with Davidski: step by step</br>There are many different ways to model your genetic ancestry. I prefer the Global25/nMonte method (see <a href="https://eurogenes.blogspot.com/2018/02/unleash-power-global-25-test-drive.html">here</a>). This is a step by step guide to modeling ancient ancestry proportions with this simple but powerful method using my own genome.</br></br>
<div class="separator" style="clear: both; text-align: center;"><a href="https://drive.google.com/file/d/1k6z0j1Rt0zSHwQqpJ2OO_xcxX9UX0Zbz/view?usp=sharing" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8XLofDFLYP2H1jLLnrd8jHQmWPbo3cgkhwjFPrtRbqhO7ugVJsmEnYJBkUlsg59fqRKGvFIaq7xZNjbQ86Q_1tgKe8Wp1cuTPSC7pEUUAPiMkZ3bFSTt8lAglc81P1Qdiv0DNPj0cmKA0/s1600/Global_25_graphic_small.png" data-original-width="300" data-original-height="383" /></a></div></br>
As far as I know, the vast majority of my recent ancestors came from the northern half of Europe. This may or may not be correct, but it gives me somewhere to start, so that I can come up with a coherent model. If you don't have this sort of information, because, perhaps, you were adopted, then just look in the mirror, and work from there. Like I say, it's not imperative that you know anything whatsoever about your ancestry, because your genetic data will do the talking, but you do need a model when modeling.</br></br>
In the scientific literature nowadays, Northern Europeans are often described as a three-way mixture between Yamnaya-related pastoralists, Anatolian-derived early farmers, and Western Hunter-Gatherers (WHG). So let's see if this model works for me. Obviously, if it does, then it'll confirm the information that I have about my origins, but it might also reveal finer details that I'm not aware of. The datasheet that I'm using for this model is available <a href="https://drive.google.com/open?id=1F2rKEVtu8nWSm7qFhxPU6UESQNsmA-sl">here</a>.</br></br>
<blockquote><b>[1] distance%=6.9025 / distance=0.069025</br></br>
Davidski</br></br>
Yamnaya_Samara 53.9</br>
Barcin_N 30.75</br>
Rochedane 15.35</br>
Tepecik_Ciftlik_N 0</b></blockquote></br>
Yep, the model does work, with a fairly reasonable distance of almost 7%. The ancestry proportions more or less match those from the scientific literature and the plethora of analyses that I've featured at this blog on the topic. Please note that I've kept things very simple, using only four reference populations and individuals as proxies for four distinct streams of ancestry. But I've put my own twist on this Neolithic/Bronze Age model by including two populations from Neolithic Anatolia (Barcin_N and Tepecik_Ciftlik_N), just to see what would happen. The WHG proxy is Rochedane.</br></br>
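To show what the distance figure measures, here is a minimal Python sketch. It assumes that the distance is the Euclidean distance between the target's coordinates and the proportion-weighted average of the reference coordinates, and that distance% is simply that value times 100, as the output above suggests. The two-dimensional toy vectors are hypothetical. Real Global25 data have 25 dimensions.</br></br>

```python
import math


def mix(sources, weights):
    """Weighted average of source coordinate vectors (weights sum to 1)."""
    dims = len(sources[0])
    return [sum(w * s[k] for w, s in zip(weights, sources)) for k in range(dims)]


def distance(target, mixed):
    """Euclidean distance between the target and the modeled mixture."""
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mixed)))


# Hypothetical 2D toy coordinates, for illustration only.
target = [0.5, 0.5]
sources = [[1.0, 0.0], [0.0, 1.0]]

print(distance(target, mix(sources, [0.5, 0.5])))  # perfect fit in this toy case
print(distance(target, mix(sources, [1.0, 0.0])))  # a much poorer fit

# distance% as printed in the model output: distance * 100
print(round(0.069025 * 100, 4))
```

The lower the distance, the closer the modeled mixture sits to the target in Global25 space.</br></br>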
Admittedly, though, my Yamnaya cut of ancestry appears somewhat bloated at over 53%, and the model's distance is a little higher than what I normally see for really strong models. So let's check if I can get a better fitting and more sensible result by adding a slightly more easterly forager proxy than Rochedane: Narva_Lithuania.</br></br>
<blockquote><b>[1] distance%=5.9331 / distance=0.059331</br></br>
Davidski</br></br>
Yamnaya_Samara 45.75</br>
Barcin_N 31.45</br>
Narva_Lithuania 22.8</br>
Rochedane 0</br>
Tepecik_Ciftlik_N 0</b></blockquote></br>
The statistical fit does improve, and when given a choice between Rochedane and Narva_Lithuania, the algorithm picks the latter as the only source of extra forager input in my genome.</br></br>
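To give a feel for how an algorithm can "pick" one reference over another, here is a simplified Python sketch. It is not nMonte, which is a Monte Carlo procedure. Instead it uses a plain greedy search that shifts small units of ancestry between references whenever doing so lowers the Euclidean distance. The function names and toy coordinates are hypothetical.</br></br>

```python
import math


def mixed_distance(target, sources, units):
    """Distance between target and the unit-weighted average of sources."""
    total = sum(units)
    mixed = [
        sum(u * s[k] for u, s in zip(units, sources)) / total
        for k in range(len(target))
    ]
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mixed)))


def greedy_fit(target, sources, total_units=20):
    """Move one unit at a time between sources while the fit improves."""
    n = len(sources)
    # Start from an (almost) equal split of the units.
    units = [total_units // n + (1 if i < total_units % n else 0) for i in range(n)]
    best = mixed_distance(target, sources, units)
    while True:
        best_move = None
        for giver in range(n):
            if units[giver] == 0:
                continue
            for taker in range(n):
                if taker == giver:
                    continue
                units[giver] -= 1
                units[taker] += 1
                d = mixed_distance(target, sources, units)
                units[giver] += 1
                units[taker] -= 1
                if d < best - 1e-12:
                    best, best_move = d, (giver, taker)
        if best_move is None:
            break  # no single-unit move improves the fit
        giver, taker = best_move
        units[giver] -= 1
        units[taker] += 1
    return [u / total_units for u in units], best


# Hypothetical 2D toy data: the target is an even mix of the first two sources.
toy_sources = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
weights, fit = greedy_fit([0.5, 0.5], toy_sources)
print(weights, fit)
```

In this toy case the third source ends up at zero, much like Rochedane does in the model above once a better-fitting proxy is available.</br></br>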
What could this mean? It might mean that a large part of my ancestry derives from the Baltic region. Actually, I know for a fact that this is true. But even if I had no idea about my genealogy, this result would be a very strong hint about my genetic origins. Indeed, let's follow this trail and try to further improve the fit of the model by adding a more relevant Yamnaya-related proxy, such as early Baltic Corded Ware (CWC_Baltic_early).</br></br>
<blockquote><b>[1] distance%=5.444 / distance=0.05444</br></br>
Davidski</br></br>
CWC_Baltic_early 54.95</br>
Barcin_N 26.7</br>
Narva_Lithuania 18.35</br>
Rochedane 0</br>
Tepecik_Ciftlik_N 0</br>
Yamnaya_Samara 0</b></blockquote></br>
Holy shit! To be honest, I wasn't expecting this sort of resolution and accuracy, and I can't promise that everyone using the Global25/nMonte method will see such incredibly nuanced outcomes, but this isn't a fluke. It can't be, because it gels so well with everything that I know about my ancestry. Please note also that I belong to Y-chromosome haplogroup R1a-M417, which is a lineage intimately associated with the Corded Ware expansion across Northern Europe (for instance, see <a href="https://eurogenes.blogspot.com/2017/12/corded-ware-as-offshoot-of-hungarian.html">here</a>).</br></br>
But of course, the Baltic and nearby regions haven't been isolated from migrations and invasions since the Corded Ware times. For instance, at some point, probably during the Bronze Age, Uralic-speaking groups moved west across the forest zone of Northeastern Europe and into the East Baltic and northern Scandinavia. It's generally accepted that they brought Siberian admixture with them (see <a href="https://eurogenes.blogspot.com/2017/11/ancient-genomes-from-ne-europe-suggest.html">here</a>). Moreover, from the Iron Age to the Middle Ages, East Central Europe was under intense pressure from a wide range of nomadic steppe groups with complex ancestry, such as the Sarmatians, Avars, Huns, and Mongolians. Did any of these peoples leave their mark on my genome? At the risk of overfitting the model, let's explore this possibility by adding a few more reference populations.</br></br>
<blockquote><b>[1] distance%=5.444 / distance=0.05444</br></br>
Davidski</br></br>
CWC_Baltic_early 54.95</br>
Barcin_N 26.7</br>
Narva_Lithuania 18.35</br>
Han 0</br>
Mongolian 0</br>
Nganassan 0</br>
Rochedane 0</br>
Sarmatian_Pokrovka 0</br>
Tepecik_Ciftlik_N 0</br>
Yamnaya_Samara 0</b></blockquote></br>
Nothing changes when I add the Han Chinese, Mongolians, Nganassans (a Uralic group from Siberia), and Sarmatians to the model. But what about if I throw in the only ancient Slav in my datasheet?</br></br>
<blockquote><b>[1] distance%=2.9904 / distance=0.029904</br></br>
Davidski</br></br>
Slav_Bohemia 85.9</br>
CWC_Baltic_early 7.7</br>
Narva_Lithuania 6.4</br>
Barcin_N 0</br>
Rochedane 0</br>
Tepecik_Ciftlik_N 0</br>
Yamnaya_Samara 0</b></blockquote></br>
Considering that the vast majority of my recent ancestors were Poles, thus a Slavic-speaking people from near the Baltic, this outcome makes perfect sense. And check out the new distance! <b>But the problem now is that I'm overfitting the model by using two very similar and probably very closely related references, CWC_Baltic_early and Slav_Bohemia. And overfitting should be avoided at all costs.</b> So it might be useful to break up this effort into two models: one focusing on the Neolithic and Bronze Age, and the other on the Iron Age and Middle Ages.
I'll do that soon, but not just yet, because there are still too few Iron Age and Medieval samples available from the Baltic region and surrounds for meaningful analyses of this type.</br></br>
See also...</br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-1-that-classic-west.html">Global25 workshop 1: that classic West Eurasian plot</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-2-intra-european.html">Global25 workshop 2: intra-European variation</a></br></br>
<a href="https://bga101.blogspot.com/2018/08/global25-workshop-3-genes-vs-geography.html">Global25 workshop 3: genes vs geography in Northern Europe</a></br></br>
<a href="https://bga101.blogspot.com/2019/07/getting-most-out-of-global25.html">Getting the most out of the Global25</a></br></br>
<a href="https://eurogenes.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">Genetic ancestry online store (to be updated regularly)</a><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-33996410718444850662016-09-22T22:41:00.002-07:002019-10-23T03:44:56.248-07:00Orcadians, the K15 and the calculator effect</br>Judging by the Google search terms that are bringing traffic to this and my other blogs, a total newb to the scene is analyzing the Orcadian samples from the HGDP at GEDmatch with my <a href="https://bga101.blogspot.com/2013/10/eurogenes-k15-now-at-gedmatch.html">K15</a> test.</br></br>
Please keep in mind that you will not see coherent results for many of the academic samples available online when using my tests.</br></br>
That's because I used these samples to source the allele frequencies for the tests. As a result, their ancestry proportions will often be very different from those of other samples from the same ethnic groups that were not used in this way.</br></br>
I call this problem the calculator effect, and it's described in my blog posts at the links below:</br></br>
<blockquote><a href="https://bga101.blogspot.com/2012/05/beware-calculator-effect.html">
Beware the "calculator effect"</a></br></br>
<a href="https://eurogenes.blogspot.com/2014/10/ancient-genomes-and-calculator-effect.html">Ancient genomes and the calculator effect</a></blockquote></br>
The calculator effect is a very serious problem for most tests, but as far as my tests are concerned, it doesn't affect anyone except the above mentioned academic samples.</br></br>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-12799192434522910212015-07-22T17:23:00.000-07:002019-10-23T03:43:08.734-07:00Marker overlap and test accuracy<br/><span style="font-size:100%;"><span style="font-family: arial;">A few people are asking me about the effects of marker overlap or genotype rate on test accuracy. Logic dictates that the better the overlap, the more accurate the results, but this isn't strictly true. Here's what I've learned over the years:<br/><br/>
<blockquote>- accuracy doesn't necessarily improve with higher marker overlap; it improves (up to a certain point) with more markers<br/><br/>
- you will still see accurate results using as few as 25,000 SNPs, as long as the test doesn't suffer from any serious problems<br/><br/>
- poorly designed tests, such as those based on fewer than 1,000 reference samples, always produce garbage results no matter what the marker overlap<br/><br/></blockquote>
In other words, a well designed test based on 200,000 SNPs will produce very accurate results for a genotype file with a marker overlap of 50%. On the other hand, another well designed test, based on just 50,000 SNPs, is likely to produce less accurate results for a genotype file with a marker overlap of 100%.<br/><br/>
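To make the arithmetic behind that comparison explicit, here is a minimal Python sketch. It assumes that what counts is the absolute number of the test's SNPs that are actually present in the genotype file (test size multiplied by overlap); the function name is made up for illustration.<br/><br/>

```python
def usable_markers(test_snps: int, overlap: float) -> int:
    """Number of a test's SNPs that are present in a genotype file."""
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be a fraction between 0 and 1")
    return round(test_snps * overlap)


large_test = usable_markers(200_000, 0.50)  # 200,000-SNP test, 50% overlap
small_test = usable_markers(50_000, 1.00)   # 50,000-SNP test, 100% overlap
print(large_test, small_test)
```

Under that assumption the larger test still works with twice as many markers despite the lower overlap, which is the point of the comparison.<br/><br/>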
So how can you tell a well designed test from a poorly designed one? It's easy: just have a look at the results each one is producing for people with less complex ancestry. For instance, ask a Lithuanian, Swede or Pole what they're seeing at the top of their oracles. Is the Swede seeing Swedish or, say, German? If the answer is German rather than Swedish, or at least some type of Scandinavian, then the test is garbage and best ignored.<br/><br/>
By the way, the recent Allentoft et al. paper on the ancient genomics of Eurasia includes a useful discussion on the effects of missing markers on the accuracy of both ADMIXTURE and PCA results. Refer to section 6.2 in the freely available supplementary info PDF <a href="https://www.nature.com/nature/journal/v522/n7555/full/nature14507.html#supplementary-information">here</a>.<br/><br/></span>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-38800697212668284032015-05-12T19:22:00.003-07:002021-10-22T20:02:52.667-07:004mix: four-way mixture modeling in R<br/><span style="font-size:100%;"><span style="font-family: arial;">Thanks to Eurogenes project member DESEUK1. A zip file with the R script, instructions and a couple of data sheets is available <a href="https://drive.google.com/file/d/0B9o3EYTdM8lQZVdGMlFMZVFLR28/view?usp=sharing&resourcekey=0-zCZZMKlfaS5HOOxvIiYICg">here</a>.<br/><br/>
So let's model Poles as a mixture of four ancient genomes from Central and Eastern Europe using output from my <a href="https://eurogenes.blogspot.com/2014/12/ane-is-primary-cause-of-west-to-east.html">K8 analysis</a>.<br/><br/>
<blockquote>Copy & Paste: source('4mix.r')<br/><br/>
Hit ENTER<br/><br/>
Copy & Paste: getMix('K8avg.csv', 'target.txt', 'HungaryGamba_EN', 'HungaryGamba_HG', 'Karelia_HG', 'Corded_Ware_LN')<br/><br/>
Hit ENTER<br/><br/>
After a few seconds you should see the results...<br/><br/>
Target = 19% HungaryGamba_EN + 14% HungaryGamba_HG + 2% Karelia_HG + 65% Corded_Ware_LN @ D = 0.0062<br/><br/></blockquote>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivvSVCAx5qUCyuamLYj26AzjclagyBDrBF0BmJ0hX5JmgZzfnPSRp4WbFtaROoLTtVEeNHtvXG7k70EXsyks473pzoc-29rDdwLqKDBf7D1_tTzNBSghL1SVL1bucxRAaV5UllvLMS9vu6/w600-h413-no/1.png" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="207" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjK9m3Q_r1PbKk9SiPD0_R5-ESu9kWzK2hyphenhyphen3zDh1ol655lVpnUukN3238r7DZ8S8hw0zVEE6cOfasavY1-XXJbnb732JcTN6SjfysBuoTBT0gCtBZaXwL_RGRiM6UqKDtUI67Qxz0tNuUJH/w300-h207-no/1small.png"/></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9LRCVJe-2mP6w4YTlQoxO62ANlTALX_-zEURxX-WUa4po85nYOIzddPgOn8zx_Lo7Oe4bUGCDg6d2VJGeuHFfJrXnp5d-_mceqYh-4aK0I3siHkqrH-ul16hc18-e0c9edUyC1BeddHoz/w600-h413-no/2.png" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="207" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrsDy3vViHBK76gJfszaVyIPJM3EgAkC4rcF-RCLXv2UAMDQgClMGMpNElYVQBRqkBmVwubh0qmjvSAncFx262Rf40k3Ws3d5XrYzdh6sQvXjI3kYbEmL7BfeOh_Yq8-N1b-dW944ECW5w/w300-h207-no/2small.png"/></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCkW-1mRlFNZGTRXnCt3eyaVFUNrsMDnCjMHEJqSlRCSJOx5uX8yK-E5A_RxfKl7tkQac2WhY3WWHFhimRh8i_uTw6TOeOXtMzOAZs_EflsskVIq5pk6jBHtTjHRIOEi1ZBNn1NVvrcZEa/w600-h413-no/3.png" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="207" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMvDyUufW06Il7q-gsOBrznpL57D-eTsqmCm2qJWHq7LlX5hIaOPS2kxeUlJyK8_SZvGTu46QIlwQ-__wWTlSUYe8cUHUWDL9C0GGX0Yd59NWhhNysFSk4-BPJY-Y9j3GOZ3FSdiBBWlci/w300-h207-no/3small.png"/></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLMQmOtT1J5fnX6HN5nr6oNqNKV_rnia-Lexe8W916k0yzp30QIkjYzyh8HaXC_e8kiz8d0M8XTil6ueuybbxgeINSpleemnNPzvOvcjwMIranmw1j_2VMcCM6sh6hYvhMocgizQcAtShz/w600-h413-no/4.png" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="207" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk85KMOkjVK-57sJ0_u8Diihhfp1WJwa8rHRkmYsueORajUxKjCbg8UEpOkbK2IRFC_UrnQ_IIz_BMwGYb7g1-rw2QIwVAZOZyXJuCJf0_ykCLCmHQbSEhMOPczisT0LvkokdqHK_hltWl/w300-h207-no/4small.png"/></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtSVg8RsekjJp_ySfy-Y63TD8Jskiln5_1lfPZdBrJMpst-TDtvMAk-R_XCHLFBm5WllVLMpXAy-zpzrOw9toyo9qGOYFIhZInPn58VsWMhHOTc2eQzEj4kclBzJ1Se-AbUPbEAPh6IMOr/w600-h413-no/5.png" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="207" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEissGZI6S-oaJz2br2zU_ROS9FlYJlC1GInv19KkXZgBexY56ijmM3JUDjQu2GMZi80l3ZJF3BsIBPLLN1YVAgWlDB4ayWj4XDVFy-XQG98YfXCOFS43XIR4R2mJR61FLKD0H7kZIdx_klx/w300-h207-no/5small.png"/></a></div><br/>
Obviously the script can use ancestry proportions and/or population averages from any test, provided they're formatted properly. The accuracy of the modeling will depend on the quality of the input.<br/><br/>
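For readers who want to see the general idea behind this kind of four-way modeling, below is a rough Python sketch. It is not the 4mix R script. It assumes a brute-force search over whole-percent weights and a plain Euclidean distance between the target and the weighted mix, and the function name, the reference names and all of the numbers are made up for illustration.<br/><br/>

```python
from math import sqrt


def four_way_mix(target, refs, step=1):
    """Brute-force the best-fitting mix of exactly four references.

    target: ancestry proportions as fractions (should sum to about 1)
    refs:   dict mapping four reference names to lists of the same length
    step:   grid resolution in percentage points
    Returns ({name: weight in percent}, distance).
    """
    names = list(refs)
    if len(names) != 4:
        raise ValueError("exactly four references are required")
    cols = [refs[name] for name in names]
    best_weights, best_d = None, float("inf")
    for a in range(0, 101, step):
        for b in range(0, 101 - a, step):
            for c in range(0, 101 - a - b, step):
                # The fourth weight is whatever is left over from 100%.
                weights = (a, b, c, 100 - a - b - c)
                d = sqrt(sum(
                    (t - sum(w * col[k] for w, col in zip(weights, cols)) / 100) ** 2
                    for k, t in enumerate(target)
                ))
                if d < best_d:
                    best_weights, best_d = weights, d
    return dict(zip(names, best_weights)), best_d


# Toy data only: four invented references and a target built from them.
refs = {
    "ref_A": [0.7, 0.1, 0.1, 0.1],
    "ref_B": [0.1, 0.7, 0.1, 0.1],
    "ref_C": [0.1, 0.1, 0.7, 0.1],
    "ref_D": [0.1, 0.1, 0.1, 0.7],
}
target = [0.22, 0.19, 0.10, 0.49]  # 20% ref_A + 15% ref_B + 65% ref_D
weights, d = four_way_mix(target, refs, step=5)
print(" + ".join(f"{pct}% {name}" for name, pct in weights.items()), f"@ D = {d:.4f}")
```

A coarse step keeps the toy run quick; at a step of 1 the search covers about 177,000 weight combinations. As with the real script, a small distance only means the mix fits the numbers well, so the quality of the input still decides whether the model means anything.<br/><br/>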
<b>Update 19/05/2015:</b> A new version of the 4mix script that can run multiple targets is available <a href="https://drive.google.com/file/d/0BxOaXGP88BrPbjJPaWxVbEFOMDQ/view?usp=sharing">here</a>, courtesy of Open Genomes.<br/><br/></span>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-63853832994900713502014-11-30T15:17:00.001-08:002019-10-23T03:39:29.888-07:00Short clip: The making of modern Europe<br/><span style="font-size:100%;"><span style="font-family: arial;">Simple but, I think, very cool animation: ten ancient genomes analyzed with the <a href="https://bga101.blogspot.com/2013/10/eurogenes-k15-now-at-gedmatch.html">Eurogenes K15</a>. More elaborate clips are on the way.<br/><br/>
<iframe width="550" height="280" src="//www.youtube.com/embed/n58rgWVUups" frameborder="0" allowfullscreen></iframe><br/><br/>
And this is basically the same thing, but restricted to samples from Hungary.<br/><br/>
<iframe width="550" height="280" src="//www.youtube.com/embed/w2CkLqm1dwM" frameborder="0" allowfullscreen></iframe>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-38963669654657201572014-07-16T03:06:00.001-07:002021-10-22T20:11:28.133-07:00Model yourself as a mixture of ancient genomes<br/><span style="font-size:100%;"><span style="font-family: arial;"><b>Update 12/05/2015:</b> <a href="https://bga101.blogspot.com/2015/05/4mix-four-way-mixture-modeling-in-r.html">4mix: four-way mixture modeling in R</a><br/><br/>
...<br/><br/>
This is really easy and should work well for most personal genomics customers (ie. those of European ancestry and with data files from 23andMe, FTDNA and AncestryDNA).<br/><br/>
First of all, make sure you have your Eurogenes K15 ancestry proportions from <a href="https://v2.gedmatch.com/select.php">GEDmatch</a>. Then do the following:<br/><br/>
<blockquote>- download the 4 Ancestors Oracle (<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQSVFBYmRWTU1GdEE/view?usp=sharing&resourcekey=0-0v70ObaPdpz0FjaLQS_iEQ">here</a>)<br/><br/>
- download the Eurogenes ancient genomes datasheet (<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQN2d3NmtWVDUyQzg/view?usp=sharing&resourcekey=0-80kOrbdM69t4HwApKDOCng">here</a>)<br/><br/>
- place everything into the same directory<br/><br/>
- double click on the 4 Ancestors Oracle icon (the big red number 4)<br/><br/>
- select the Eurogenes K15 ancient genomes datasheet<br/><br/>
- type your Eurogenes K15 ancestry proportions into the fields provided<br/><br/>
- hit the go button and let it rip</blockquote><br/>
I'm not sure I'm allowed to upload the 4 Ancestors Oracle online, but I couldn't find the original link, so let's assume for the time being that I am. In any case, many thanks to Alexandr Burnashev for this great tool.<br/><br/>
You'll also find some modern populations in the datasheet. They're there so that users with ancestry from outside of Europe don't end up with ridiculous results.<br/><br/>
Obviously, you can edit the datasheet to explore more options by removing or adding individuals and populations. A spreadsheet of Eurogenes K15 population averages is available <a href="https://docs.google.com/spreadsheets/d/19c_bZjUV_RouKyGyLHmMDw57WwAVabXFJOaso_gcuRE/edit?usp=sharing">here</a>. The oracle settings can also be tweaked in a couple of ways to fine tune the results.<br/><br/>
If the calculator crashes, try replacing the periods with commas in both the datasheet and your ancestry proportions.<br/><br/>
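The crash is presumably a decimal separator issue on systems that expect a comma rather than a period, although that is a guess on my part. If editing by hand is a chore, a small Python sketch like the one below can do the swap. It assumes the numbers are separated by spaces or tabs, not commas, and the helper name and sample line are made up for illustration.<br/><br/>

```python
import re


def periods_to_commas(text: str) -> str:
    """Swap the decimal point for a decimal comma in numbers such as 12.34."""
    # Only periods that sit between two digits are touched, so names and
    # sentence punctuation are left alone.
    return re.sub(r"(?<=\d)\.(?=\d)", ",", text)


print(periods_to_commas("Sample_1 12.34 0.5 87.16"))
```

If the datasheet itself uses commas to separate its fields, this swap would scramble the columns, so check the file in a text editor first and keep a backup.<br/><br/>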
Please keep checking this post, because I'll attempt to update the datasheet at the link above every time a new ancient genome is published and has enough markers available to be tested with the Eurogenes K15. Eventually we might end up with a tool that covers most of the continents and many periods of history and prehistory.<br/><br/>
I've done similar analyses of a variety of ancient genomes. For instance, StoraFörvar11, or SfF11, from Mesolithic Sweden came out 3/4 La Brana-1 and 1/4 MA-1, which translates to 3/4 Western European Hunter-Gatherer (WHG) and 1/4 Ancient North Eurasian (ANE), and lines up well with results reported recently for Swedish hunter-gatherers in the scientific literature. You can see the full analysis of StoraFörvar11 and a couple of other ancient genomes at the links below.<br/><br/>
<a href="https://eurogenes.blogspot.com/2014/07/analysis-of-mesolithic-swedish-forager.html">Analysis of Mesolithic Swedish forager StoraFörvar11</a><br/><br/>
<a href="https://eurogenes.blogspot.com/2014/07/more-ancient-genomes-from-sweden-pitted.html">
More ancient genomes from Sweden: Pitted Ware forager Ajvide58 and TRB farm girl Gokhem2</a><br/><br/>
I'm still trying to answer a whole lot of e-mails so I won't be monitoring this post for a while. But please feel free to share your results and any tips you might have in the comments below.<br/><br/></a>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-77509971584460814602013-12-28T02:15:00.002-08:002021-10-22T20:08:39.382-07:00EEF-WHG-ANE test for Europeans<br/><span style="font-size:100%;"><span style="font-family: arial;">This test attempts to fit you to the three inferred prehistoric European populations as described in this recent <a href="https://biorxiv.org/content/early/2013/12/23/001552">preprint</a>. The relevant Excel file can be downloaded <a href="https://docs.google.com/spreadsheets/d/0B9o3EYTdM8lQNENBV3dMLW1ob2s/edit?usp=sharing&ouid=106454489917427406376&resourcekey=0-y5t45t61ClA6n1VuZhJI8A&rtpof=true&sd=true">here</a>, and all you have to do is stick your Eurogenes K13 results into the fields provided to get the EEF-WHG-ANE ancestry proportions. A modified version for Near Eastern and Southeast European users can be accessed <a href="https://docs.google.com/spreadsheet/ccc?key=0Aqr2nbGXpVFndEZxZHF1OVNJOEtrcVpyLXVmd1QzeEE&usp=drive_web#gid=0">here</a>.<br/><br/>
The test is based on correlations between the average levels of the Eurogenes K13 and the ancient components among selected European populations. Below is a brief description of each of the ancient components.<br/><br/>
<blockquote><b>Early European Farmer (EEF):</b> apparently this is a hybrid component, the result of mixture between "Basal Eurasians" and a WHG-like population possibly from the Balkans. It's based on a 7,500 year old Linearbandkeramik (LBK) sample from Stuttgart, Germany, but today peaks at just over 80% among Sardinians.<br/><br/>
<b>West European Hunter-Gatherer (WHG):</b> this ancestral component is based on an 8,000 year old forager from the Loschbour rock shelter in Luxembourg, who belonged to Y-chromosome haplogroup I2a1b. However, today the WHG component peaks among Estonians and Lithuanians, in the East Baltic region, at almost 50%.<br/><br/>
<b>Ancient North Eurasian (ANE):</b> this is the twist in the tale, a component based on a 24,000 year old Upper Paleolithic forager from South Central Siberia, belonging to Y-DNA R*, and known as Mal'ta boy or MA-1. This component was very likely present in Southern Scandinavia since at least the Mesolithic, but only seems to have reached Western Europe after the Neolithic. At some point it also spread into the Americas. In Europe today it peaks among Estonians at just over 18%, and, intriguingly, reaches a similar level among Scots. However, numbers weren't given in the paper for Finns, Russians and Mordovians, who, according to one of the maps, also carry very high ANE, but their results are confounded by more recent Siberian (ENA) admixture.<br/><br/></blockquote>
It's important to note that this test is only likely to be accurate for people of European ancestry, and indeed only those who aren't outliers from the main European clines of genetic diversity. For details of what that means, please consult the aforementioned paper. However, roughly speaking, if you're of European origin and don't score more than 3% East Asian, Siberian, Amerindian, South Asian, Oceanian, Northeast African and/or Sub-Saharan admixture, then you should get a coherent result. Users from the Near East and Caucasus should run the version specifically designed for them, while those from Southeastern Europe might find it useful to run both calculators and then compare the results.<br/><br/>
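The rough 3% rule can be checked quickly before running the spreadsheet. Here is a minimal Python sketch that reads the rule as applying to each listed component separately, which is my interpretation and not something the spreadsheet enforces. The component spellings, the function name and the example scores are assumptions, so match them to your own K13 output.<br/><br/>

```python
# Components named above as signs of ancestry from outside Europe.
# The exact spellings are assumptions; adjust them to your K13 output.
NON_EUROPEAN = (
    "East_Asian", "Siberian", "Amerindian", "South_Asian",
    "Oceanian", "Northeast_African", "Sub-Saharan",
)


def components_over_limit(k13: dict, limit: float = 3.0) -> list:
    """Return the listed components whose score (in percent) exceeds limit."""
    return [name for name in NON_EUROPEAN if k13.get(name, 0.0) > limit]


# Made-up scores for illustration only.
example = {"Baltic": 45.1, "North_Atlantic": 30.2, "Siberian": 4.2, "East_Asian": 0.6}
print(components_over_limit(example))
```

An empty list suggests the result should be coherent; anything listed is a hint to try the modified version or to treat the output with caution.<br/><br/>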
Thanks to project member DESUK1 for putting this together at such short notice, and MfA for the modified version. Please post your results in the comments section below and state your ancestry when you do. This will help us to improve the accuracy of the test. My results make perfect sense, considering my Polish ancestry.<br/><br/>
<blockquote><b>EEF</b> 42.012706<br/>
<b>WHG</b> 40.52702615<br/>
<b>ANE</b> 17.46026785</blockquote><br/>
This is my interpretation of who these components represent. Of course, this model might change when more ancient genomes are analyzed.<br/><br/>
<blockquote>WHG and WHG/ANE: indigenous European hunter-gatherers<br/>
EEF: mixed European/Near Eastern Neolithic farmers<br/>
ANE/WHG: Proto-Indo-European invaders from the Eastern European steppe<br/>
ENA/ANE: early Uralics from the Volga-Ural region<br/>
EEF/WHG/ANE: late Indo-Europeans (ie. Celts, Germanics and Slavs)</blockquote><br/>
Citation...<br/><br/>
Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., <a href="https://biorxiv.org/content/early/2013/12/23/001552">Ancient human genomes suggest three ancestral populations for present-day Europeans</a>, bioRxiv, Posted December 23, 2013, doi: 10.1101/001552<br/><br/>
See also...<br/><br/>
<a href="https://eurogenes.blogspot.com/2013/12/ancient-human-genomes-suggest-three.html">Ancient human genomes suggest (more than) three ancestral populations for present-day Europeans</a><br/><br/>
<a href="https://eurogenes.blogspot.com/2014/03/ancient-north-eurasian-ane-levels.html">Ancient North Eurasian (ANE) levels across Asia</a><br/><br/></span>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-79955013828521120312013-11-21T17:26:00.002-08:002021-10-22T20:18:59.623-07:00Updated Eurogenes K13 now at GEDmatch<br/><span style="font-size:100%;"><span style="font-family: arial;">The new K13 population averages and genetic (Fst) distances between the inferred ancestral clusters are available <a href="https://docs.google.com/spreadsheets/d/1Oz6P5-SVEJciPX1TciGe-zoqA5JtOGIMG7nh-rCOj0c/edit?usp=sharing">here</a> and <a href="https://drive.google.com/file/d/0B9o3EYTdM8lQV2RJS1RwOW9XLUE/view?usp=sharing&resourcekey=0-ceLbWE9WG7gvHkoFGDLDtg">here</a>, respectively. To find this test at GEDmatch do this:<br/><br/>
<a href="https://gedmatch.com/">GEDmatch</a> > <a href="https://ww2.gedmatch.com:8006/autosomal/ap_mix1_gen.php">Ad-Mix Utilities</a> > Eurogenes > K13</br></br>
Below is a 2D PCA based on the average K13 results of the European and Asian reference populations, courtesy of project member PL16.</br></br>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQa2o4SFRWel9pdnM/view?usp=sharing&resourcekey=0-daUy6cUAADJQgsPa8084tA" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="204" width="400" src="https://lh6.googleusercontent.com/-TUB1lFvV-hs/U8ZkRrP2B9I/AAAAAAAAA8Q/UtYg-puhXG0/w400-h204-no/K13_Europe_Asia_PCA_small.png" /></a></div><br/>
I now have four tests at GEDmatch with Oracles: the Jtest, EUtest, K15 and K13. It's useful to keep in mind that these tests will differ in their interpretation of the data, and perhaps accuracy, depending on the ancestry of the user. For instance, the new K13 should be more useful for Central and South Asians than any of the others, because it features new reference samples relevant to them.<br/></span></br>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-76840830284939027042013-10-07T04:53:00.002-07:002021-10-22T20:14:55.231-07:00Eurogenes K15 now at GEDmatch<br/>This new test is essentially an upgraded version of the EUtest. Unlike the original, it includes an Amerindian component and five native reference populations from North and Central America. So obviously it should be a lot more useful for users from the New World who are wondering about Amerindian admixture.</br></br>
<a href="https://gedmatch.com/">GEDmatch</a> > <a href="https://ww2.gedmatch.com:8006/autosomal/ap_mix1_gen.php">Ad-Mix Utilities</a> > Eurogenes > Eurogenes EUtestV2 K15</br></br>
I just tried it myself, and have to say that the 4-Ancestors Oracle results were impressive. In other words, they were very accurate based on what I know about my recent ancestry. On the other hand, I'd say the default Oracle was picking up more ancient gene flows. However, this might not be the case for everyone, so let's hear some feedback, discuss the outcomes, and perhaps tweak the settings if necessary.</br></br>
One of the most important things to keep in mind is to ignore all results under 1%. These are likely to be noise.</br></br>
The population averages and Fst distances between the ancestral clusters are <a href="https://docs.google.com/spreadsheets/d/19c_bZjUV_RouKyGyLHmMDw57WwAVabXFJOaso_gcuRE/edit?usp=sharing">here</a> and <a href="https://drive.google.com/file/d/0B9o3EYTdM8lQS3VvTUYyYXd0akk/view?usp=sharing&resourcekey=0-OsVWvy5TCJEEBSqPXjkrZw">here</a>, respectively. Below are spatial maps of the main West Eurasian components courtesy of Gui (FR7): Baltic, North Sea, Atlantic, East Euro, West Med, East Med, West Asian.</br></br></br>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQLWRXNllIVFBEUjQ/view?usp=sharing&resourcekey=0-Px1SckdyP6kIHicz9FRxZg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="191" width="350" src="https://lh5.googleusercontent.com/-SpHGxg9cr0I/U8Zot_3RhdI/AAAAAAAAA84/tcLw6HS4AZQ/w350-h191-no/Baltic_K15_small.png" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQbHV5R0YxcTRJejA/view?usp=sharing&resourcekey=0-3-klkV93z6pLMvAyJx3kQQ" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="198" width="350" src="https://lh3.googleusercontent.com/-SGQpYUKZMdk/U8ZouyT-VSI/AAAAAAAAA9A/pyPC4ghy5Eg/w350-h198-no/North_Sea_K15_small.png" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQa2Q4WmdsWHpBdHc/view?usp=sharing&resourcekey=0-wNetsqI7KjEcKA_Bcxis2g" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="207" width="350" src="https://lh6.googleusercontent.com/-ceJ9vNbit60/U8ZotKTlOvI/AAAAAAAAA8o/hsTqLs5SvyU/w350-h207-no/Atlantic_K15_small.png" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQOXFMakN2cVY0MDQ/edit?usp=sharing" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="189" width="350" src="https://lh4.googleusercontent.com/-zEvSYyBztYo/U8Zot_a3QqI/AAAAAAAAA80/ltcgCBUDLic/w350-h189-no/Eastern_Euro_K15_small.png" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQSlhteUhjckZCVm8/edit?usp=sharing" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="297" width="350" src="https://lh6.googleusercontent.com/-l6wMKdxOFEQ/U8Zoym_QRQI/AAAAAAAAA9Y/bIToZImS28w/w350-h297-no/West_Med_K15_small.png" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQSmc2bzg0OHQwUUE/view?usp=sharing&resourcekey=0-L4zqdhIQaPorcEv4ZdJRGw" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="243" width="350" src="https://lh4.googleusercontent.com/-tU2YIJ_Epgo/U8ZqEche8RI/AAAAAAAAA94/fS9D13K7WXg/w350-h243-no/East_Med_K15_small.png" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQQkFkc0hOazFqU1E/view?usp=sharing&resourcekey=0-VtKvBZ6dX0ohrKbxOxb_YQ" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="222" width="350" src="https://lh5.googleusercontent.com/-wAaZojuveRo/U8Zox-K0pII/AAAAAAAAA9Q/jW3A3rWVLCc/w350-h222-no/West_Asian_K15_small.png" /></a></div><br/></br>
See also...<br/></br>
<a href="https://bga101.blogspot.com/2016/09/orcadians-k15-and-calculator-effect.html">
Orcadians, the K15 and the calculator effect</a><br/></br>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-2714715571600892722013-03-09T17:00:00.002-08:002021-10-22T20:17:14.352-07:00Eurogenes K36 now at GEDmatch<br/><span style="font-size:100%;"><span style="font-family: arial;">I've just put together a new test for GEDmatch called the Eurogenes K36. Obviously, the K36 means that it features thirty six ancestral clusters. It probably won't include any Oracles, mostly because the <a href="https://bga101.blogspot.com/2012/05/beware-calculator-effect.html">Calculator Effect</a> would render these useless if they were based on the average results of the reference samples, and it'd be very time consuming for me to test a wide variety of other samples in supervised mode using thirty six sets of allele frequencies.<br/><br/>
The main purpose of the Eurogenes K36 is to help users unravel the ethnic origins of local areas of their genomes (aka. half-segments), hence the high number of ancestral categories, some of which are very specific. In other words, the test is mainly a chromosome painting utility. It's accessible via the GEDmatch Ad-Mix link below:<br/><br/><br/>
GEDmatch > <a href="https://ww2.gedmatch.com:8006/autosomal/ap_mix1_gen.php">Ad-Mix page</a> > Eurogenes > Eurogenes K36<br/><br/><br/>
An important point to keep in mind is not to take the ancestry proportions too literally. If you're, say, English, and you get an Iberian score of 12%, this doesn't actually mean you have recent ancestry from Spain or Portugal. What it means is that 12% of your alleles look typical of the reference samples classified as Iberian, and this figure might only indicate recent Iberian admixture if it's clearly higher than the scores of other English users.<br/><br/>
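One simple way to judge whether a score is "clearly higher" than those of your peers is to express it in standard deviations above the peer average. The Python sketch below does that. Using a z-score here is my suggestion, not something built into the test, and the function name and all of the peer scores are made up for illustration.<br/><br/>

```python
from statistics import mean, stdev


def excess_over_peers(score: float, peer_scores: list) -> float:
    """How many standard deviations a score sits above the peer average."""
    if len(peer_scores) < 2:
        raise ValueError("need at least two peer scores")
    spread = stdev(peer_scores)
    if spread == 0:
        raise ValueError("peer scores are all identical")
    return (score - mean(peer_scores)) / spread


# Made-up Iberian scores (in percent) for six users of the same ancestry.
peers = [11.0, 12.5, 10.8, 13.1, 12.0, 11.6]
print(round(excess_over_peers(12.0, peers), 2))  # ordinary for this group
print(round(excess_over_peers(19.0, peers), 2))  # far above the group
```

With only a handful of peer scores the estimate is shaky, so treat it as a rough guide and collect as many comparable results as you can.<br/><br/>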
Another way to look at it is that the ancestry proportions are like map coordinates, and they'll place you with a very high degree of accuracy on a genetic map featuring other users. Indeed, please feel free to post your scores and ancestry details in the comments below to help others get an idea of what their results might represent. My results are listed below. The scores put me squarely in Poland relative to those of other European samples I've run, which is correct.<br/><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQQkFkc0hOazFqU1E/view?usp=sharing&resourcekey=0-VtKvBZ6dX0ohrKbxOxb_YQ"><img border="0" height="533" width="200" src="https://lh5.googleusercontent.com/-2DwoKjmVuuY/U8ZvL9_pJoI/AAAAAAAAA-U/HPQW-hDJBCA/w200-h533-no/My_K36_small.png" /></a></div><br/>
Also worth mentioning is that this test focuses on much deeper ancestry than the <a href="https://www.23andme.com/you/ancestry/composition/">Ancestry Composition</a> at 23andMe. Hence, I expect that many Europeans will score a few percent in non-European clusters. However, like many ADMIXTURE results, this could give us strong hints about population movements into Europe during prehistory and early history, so it's worth keeping an eye on.</span><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-14689827748377936912012-12-03T22:28:00.003-08:002019-10-23T03:30:03.630-07:004-Ancestors Oracle at GEDmatch<br/><span style="font-size:100%;"><span style="font-family: arial;">The <a href="https://bga101.blogspot.com/2012/09/eurogenes-ashkenazim-ancestry-test-files.html">Jtest and EUtest</a> at GEDmatch now include a new tool called the 4-Ancestors Oracle (aka. Oracle-4), as well as the 3D PCAs I promised earlier. Oracle-4 will attempt to pinpoint your ethnic group of origin, and then also work out the most likely combinations of two, three and four ancestral populations which make up your genome. However, this doesn't mean the results will actually show your ethnic group, or those of your parents (in dual mode) or grandparents (4-way mode). They might for many people, but for others they'll reflect the best possible outcomes from the reference samples available.<br/><br/>
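To give a feel for what an oracle of this kind does, here is a small Python sketch. It is not the GEDmatch code. It assumes that each of the four ancestors contributes an equal 25% share, that the same population may appear more than once, and that candidates are ranked by Euclidean distance to your proportions. The function name, the population names and the numbers are all made up for illustration.<br/><br/>

```python
from itertools import combinations_with_replacement
from math import dist


def oracle4(target, pops, n_ancestors=4, top=3):
    """Rank equal-share combinations of reference populations by distance."""
    results = []
    for combo in combinations_with_replacement(sorted(pops), n_ancestors):
        # Average the chosen populations component by component.
        blend = [sum(pops[name][k] for name in combo) / n_ancestors
                 for k in range(len(target))]
        results.append((dist(target, blend), combo))
    results.sort()
    return results[:top]


# Toy reference averages (percent per component) and a toy target that is
# two parts Pop_X, one part Pop_Y and one part Pop_Z.
pops = {
    "Pop_X": [80, 15, 5],
    "Pop_Y": [20, 70, 10],
    "Pop_Z": [10, 20, 70],
}
target = [47.5, 30.0, 22.5]
for distance, combo in oracle4(target, pops):
    print(f"{distance:.2f}", " + ".join(combo))
```

Calling the same function with n_ancestors set to 1, 2 or 3 gives the single-population, dual and three-way views. The best hit is always the closest combination available, whether or not it matches your real family history.<br/><br/>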
<a href="https://ww2.gedmatch.com:8006/autosomal/ap_mix1_gen.php">GEDmatch Ad-Mix Utilities</a><br/><br/>
Enjoy, and feel free to give feedback to John at GEDmatch if you think it might be useful (but please don't spam his account).</span><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-75926307441314402682012-09-27T22:28:00.002-07:002021-10-22T20:24:49.728-07:00Jtest K14 - the Eurogenes Ashkenazi ancestry test<br/><span style="font-size:100%;"><span style="font-family: arial;"><b>Update 19/03/2018:</b> It's come to my attention that many people are still using the Jtest and taking the results very seriously. Indeed, perhaps too seriously.<br/><br/>
Also, some users are doing weird stuff with the Jtest output in an attempt to estimate their supposedly "true" Ashkenazi ancestry proportions, like multiplying their Ashkenazi coefficient by three, because Ashkenazi Jews "only" score around 30% Ashkenazi in this test. Ouch! Please don't do that!<br/><br/>
Let me reiterate that this test was only supposed to be a fun experiment. It was never meant to be the definitive online Ashkenazi ancestry test. And even as fun experiments with ADMIXTURE go, it's now horribly outdated, and probably useless for anyone with less than 15-20% Ashkenazi ancestry.<br/><br/>
So it might be time to move on. If you really want to confirm your Jewish ancestry, whether Ashkenazi, Sephardi or both, then you need to look at much more powerful and sophisticated options. One of these options is the <b>Global25</b> analysis (see <a href="https://bga101.blogspot.com/2017/10/genetic-ancestry-online-store-to-be.html">HERE</a>), which can pick up minor Jewish ancestry of just a few per cent. But it's not free (USD $12), and it's a DIY test that requires a bit of time and effort to get the most out of it. Also, you'd need to send me your autosomal file so that I can estimate your Global25 coordinates. But I can help you get started and even quickly check if you have any hope at all of confirming Jewish ancestry.<br/><br/>
If, for whatever reason, you'd rather not take advantage of the Global25 offer, because, say, you don't want to share your data with me, then it might be an idea to join the Anthrogenica discussion board and ask the experienced members there about other options [<a href="https://anthrogenica.com/forum.php">LINK</a>].<br/><br/>
In any case, whatever you choose to do, please remember the following points, and feel free to share them with others who are still using the Jtest:<br/><br/>
<blockquote><b>- do not multiply your Jtest Ashkenazi score by 3 in an attempt to find your "true" Ashkenazi ancestry proportion, because this won't work for the vast majority of users<br/><br/>
- but do compare your Jtest Ashkenazi score to those of other people of the same or very similar ancestry to yours to get a rough idea whether you might have any Ashkenazi ancestry (the Jtest population averages will be useful for this, see <a href="https://docs.google.com/spreadsheets/d/1XgXrkqivuGCbYocBm_MMbdRd2tX1A68D3bl_wJN3pUM/edit?usp=sharing">here</a>)<br/><br/>
- if you're still not sure what your Jtest results mean, then just focus on your Jtest Oracle-4 output at GEDmatch, and if you don't see AJ at the top of the oracle list, then this is a strong signal that you don't have substantial Ashkenazi ancestry</b></blockquote>
...<br/><br/>
I recently learned that the new Ancestry Painting at 23andMe will include an Ashkenazi reference group. To be honest, I’m not sure there’s much value in using a genetically bottlenecked population of varied biogeographical origins as a reference in such things. Indeed, the Ashkenazi mainly descend from a few hundred founders, but carry Central European, Eastern European, Middle Eastern, African and probably many other admixtures, as evidenced by their genome-wide and uniparental markers.<br/><br/>
That’s quite a problem, because due to their relative inbreeding, they produce strong ancestral clusters in many analyses, like in ADMIXTURE runs. However, these clusters are made up of allele frequencies from a wide range of sources and, paradoxically, it’s the relatively more outbred populations which contributed to the Ashkenazi gene pool at its formative stages that often end up showing Ashkenazi admixture in such tests, despite not having any. I've seen this happen regularly in my experiments with ADMIXTURE and STRUCTURE, and I'm pretty sure I could find an example in a peer reviewed study if I tried.<br/><br/>
That’s just how things work with the algorithms we have available to run these sorts of tests. Nevertheless, since 23andMe is incorporating an Ashkenazi cluster into its new painting, I thought I’d try and come up with an Ashkenazi ancestry test to perhaps get a rough idea of what we might expect. I'm using ADMIXTURE in supervised mode, and basically trying to recreate clusters that have shown up in a variety of fine-scale analyses, including my <a href="https://bga101.blogspot.com/2012/01/eurogenes-north-euro-clusters-phase-2.html">ChromoPainter run</a> of Northern European samples. It’s still a work in progress, but below are links to files that many of you might find useful.<br/><br/>
<blockquote><a href="https://drive.google.com/file/d/0B9o3EYTdM8lQQnE5eDNuSEJ1eFE/view?usp=sharing&resourcekey=0-tJosmwjvLk6GNiYMtInvYw">Jtest K14 files</a><br/><br/>
<a href="https://docs.google.com/spreadsheets/d/1XgXrkqivuGCbYocBm_MMbdRd2tX1A68D3bl_wJN3pUM/edit?usp=sharing">Jtest averages for selected populations</a><br/><br/>
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQZDZjMWN4SzV3cU0/view?usp=sharing&resourcekey=0-n8bAf3n4RvqvPXi99NTJog">EUtest K13 files</a><br/><br/>
<a href="https://docs.google.com/spreadsheets/d/1koOC326xAIhA7XlzhP8HxTeelemMmv4OlOG4UzRzKl0/edit?usp=sharing">EUtest averages for selected populations</a></blockquote><br/>
The Jtest folder contains files that can be used to make an Ashkenazi ancestry test/chromosome painting with 14 Eurasian and African clusters. The EUtest folder contains the same files, except that the Ashkenazi allele frequencies have been removed. It’s useful to cross check results from both tests, mainly to see what’s hiding under the Ashkenazi admixture if it shows up in the Jtest.<br/><br/>
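One way to cross check the two tests is to line up the component scores and see which EUtest components grow once the Ashkenazi cluster is no longer available. The sketch below assumes the results have been copied by hand into two dictionaries. The component names and percentages are placeholders, not real output.

```python
def redistribution(jtest, eutest):
    """Per-component change (EUtest minus Jtest), largest gains first."""
    deltas = {}
    for name, eu_value in eutest.items():
        # A component missing from the Jtest dict is treated as 0%.
        deltas[name] = round(eu_value - jtest.get(name, 0.0), 2)
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# Placeholder scores for illustration only.
jtest = {"ASHKENAZI": 12.0, "EAST_MED": 10.0, "WEST_MED": 8.0,
         "ATLANTIC": 30.0, "NORTH_CENTRAL_EURO": 40.0}
eutest = {"EAST_MED": 16.5, "WEST_MED": 11.0,
          "ATLANTIC": 31.0, "NORTH_CENTRAL_EURO": 41.5}

for name, gain in redistribution(jtest, eutest):
    print(f"{name}: {gain:+.2f}")
```

The components with the largest gains are the ones that were "hiding" under the Ashkenazi score in the Jtest.<br/><br/>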
Based on a few test runs today, I’d say that the noise level for the continental clusters is much less than 1%. But it rises to a few per cent for the intra-West Eurasian clusters. In other words, if you’re European, then you might score something like 0.02% in the Sub-Saharan cluster, which basically means 0%. However, you might get around 2% in the Middle Eastern cluster, even though you’re from Central Europe, and you don’t have any recent Middle Eastern ancestry. You can blame various prehistoric and historic migrations into Europe for these seemingly quirky results, and also the fact that Mesolithic Europeans were significantly Eurasian (i.e. Siberian, Amerindian and South Asian-like).<br/><br/>
The Ashkenazi cluster is very similar to the Middle Eastern cluster in that regard. So anyone who gets an Ashkenazi score of around 2-3% either has very distant Jewish ancestry or, more likely, none at all. However, those who show more than 25% membership in that cluster are almost certainly of fully Ashkenazi ancestry, and their genomes peppered with Ashkenazi-specific chromosomal segments.<br/><br/>
There’s really not much difference between 2% and 25%, you might say. In fact, there is if we say there is. As always, the main thing to remember is that these clusters don’t really exist, because genetic variation is clinal, so the cluster names are basically arbitrary and it’s always the relative results that matter. <b>That’s why to really understand what your scores mean, you need to compare them with those of other users.<br/><br/>
Obviously, it's best to compare with people from the same ethnic and/or regional groups. If the Ashkenazi + East Med scores look relatively inflated, that's a sign of recent Ashkenazi ancestry.</b><br/><br/>
Feel free to use the files above for anything you want, except commercial stuff. Please note, I make no guarantees that they’ll provide accurate results for everyone. I might update this post early next week with new and/or additional files and more tips.<br/><br/>
...<br/><br/>
<b>Update 6/10/2012:</b> The Jtest K14 and EUtest K13 will soon be available at GEDmatch, accompanied by an "Oracle" population matching analysis and maybe even a 3D genetic map. If all goes to plan, the population matching test should be able to give a decisive yea or nay to anyone wondering whether they have recent Ashkenazi ancestry.<br/><br/>
By the way, below is a PCA based on the Jtest <a href="https://docs.google.com/spreadsheets/d/1XgXrkqivuGCbYocBm_MMbdRd2tX1A68D3bl_wJN3pUM/edit?usp=sharing">averages</a> for selected populations. It was produced by one of my project members so that we could check the reliability of the 14 "ancestral" components. The samples were classified into clusters based on their highest peaking component. So, for instance, the Scots are in the light blue Atlantic cluster, along with French Basques, because the Atlantic component dominates in both groups. However, overall, they're more similar to other samples than to each other.<br/><br/>
As per above, the plan is that GEDmatch will soon offer a 3D genetic map based on the loadings from this PCA.<br/><br/>
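The "highest peaking component" grouping, and the reason two populations in the same group can still be more similar to other samples than to each other, can be illustrated with a small sketch. The percentages below are invented for the example and are not the real Jtest averages. The helper names are hypothetical too.

```python
from math import dist

def peak_component(profile):
    """Name of the component with the highest percentage."""
    return max(profile, key=profile.get)

def profile_distance(a, b):
    """Euclidean distance between two profiles over the same components."""
    keys = sorted(a)
    return dist([a[k] for k in keys], [b[k] for k in keys])

# Invented averages for illustration only.
averages = {
    "Scots":         {"Atlantic": 40, "North_Central": 35, "West_Med": 10, "Other": 15},
    "French_Basque": {"Atlantic": 50, "North_Central": 10, "West_Med": 30, "Other": 10},
    "Norwegian":     {"Atlantic": 30, "North_Central": 45, "West_Med": 8,  "Other": 17},
}

groups = {pop: peak_component(p) for pop, p in averages.items()}
print(groups)
print(profile_distance(averages["Scots"], averages["French_Basque"]))
print(profile_distance(averages["Scots"], averages["Norwegian"]))
```

With these invented numbers the Scots and French Basques share a peak component, yet the Scots profile is closer overall to the Norwegian one, which is the pattern described above.<br/><br/>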
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQNUJadEU5T3VDV0E/view?usp=sharing&resourcekey=0-W0TJZm8S8Dex5gi-nAZzFg" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="400" width="400" src="https://lh4.googleusercontent.com/-PGDOAXOsPyo/VBjtDW8kuQI/AAAAAAAABWs/RThrI0BMKOU/s400-no/Jtest_pca12s_small.png" /></a></div><br/>
<b>Update 11/10/2012:</b> The Jtest and EUtest are now on offer at GEDmatch. The quickest way to get there is via this link to the <a href="https://ww2.gedmatch.com:8006/autosomal/ap_mix1_gen.php">Ad-Mix page</a>. Then, from the drop down menus, choose Eurogenes, followed by Jtest.<br/><br/>
First run the Admix test to check whether your Ashkenazi admixture is significantly higher than expected for your part of the world (as per above, Jtest averages for selected populations are available <a href="https://drive.google.com/file/d/0B9o3EYTdM8lQOTJFaUlKWkd5TW8/view?usp=sharing&resourcekey=0-0dQNTpIVgmzSki2BUMIVSg">here</a>). Then move on to the Oracle analysis by pressing the relevant button at the bottom of the page.<br/><br/>
If your Ashkenazi admixture is clearly elevated, and the top 20 single and/or mixed mode Oracle results show AJ (Ashkenazi Jews) as one of your potential matches, then it’s likely you have recent Ashkenazi ancestry.<br/><br/>
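The two conditions above can be written down as a simple check. This is a sketch under assumptions: the post does not say how much higher counts as "clearly elevated", so the <code>margin</code> of 3 percentage points is a guess based on the noise level mentioned earlier, and the Oracle rows are assumed to have been typed in by hand as tuples of population labels.

```python
def likely_recent_ashkenazi(my_score, regional_average, oracle_rows,
                            margin=3.0, top_n=20, label="AJ"):
    """True only if the score is elevated AND the label is in the top Oracle rows.

    margin is an assumed cutoff, not a value given in the post.
    oracle_rows is a list of tuples of population labels, best match first.
    """
    elevated = my_score > regional_average + margin
    listed = any(label in row for row in oracle_rows[:top_n])
    return elevated and listed

# Hypothetical Oracle output, single and mixed mode rows combined.
rows = [("Polish",), ("Polish", "AJ"), ("Ukrainian",)]
print(likely_recent_ashkenazi(9.6, 2.1, rows))
print(likely_recent_ashkenazi(2.5, 2.1, rows))
```

Both conditions have to hold. An elevated score without AJ in the Oracle list, or the other way around, is not enough on its own.<br/><br/>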
Whether that’s the case or not, you can then move on to the Chromosome Painting feature to see where the potential Ashkenazi admixture is located in your genome. It’s useful to cross check the results with those from the Ancestry Finder at 23andMe to assess their accuracy.<br/><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQMHI3bTBad1ctSWM/view?usp=sharing&resourcekey=0-t2rEqA2UD1Gh1sBDeVzhBA" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="188" width="300" src="https://lh4.googleusercontent.com/-yA6UwEbUoCc/VBjtEPz-9gI/AAAAAAAABW0/zqTuhigI20s/w300-h188-no/PL1_Jtest_small.png" /></a></div><br/>
As already mentioned, the EUtest is exactly the same as the Jtest, but with the Ashkenazi allele frequencies taken out. You can use this option to see what’s hiding under your Ashkenazi admixture in the Jtest. To compare your results with those of selected populations from Europe, Asia and Africa, refer to the EUtest averages <a href="https://drive.google.com/file/d/0B9o3EYTdM8lQS1lQYUZIVlNmTlk/view?usp=sharing&resourcekey=0-cwqTqhIFWt6DrwbKV3GFmQ">sheet</a>.<br/><br/>
<b>Please note:</b> it's important to interpret the results with insight. You need to learn how the system works, pay attention to the types of populations that appear in your results, consider carefully why they might be paired with other populations, and of course study the statistics in detail. Expecting a bullseye classification at the top of the Oracle list is likely to lead to major disappointment for many people, simply because I don't have enough samples to represent all of the substructures that exist around the world, especially within countries.<br/><br/>
I’ll try and update both tests in a few weeks, after seeing how successful the whole setup is at predicting Ashkenazi admixture and locating it in the genome. One of the main goals will be to improve the accuracy of the Oracle analysis for everyone, including New World people with Amerindian admixture.<br/><br/>
<b>Update 21/10/2012:</b> Below are spatial maps of a few of the ancestral clusters from the Jtest, courtesy of project member FR7.<br/><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://drive.google.com/file/d/0B9o3EYTdM8lQV1NCLTJhcW9vVHc/view?usp=sharing&resourcekey=0-j-nhM_LSCV2AYgT99PF4Ug" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="541" width="142" src="https://lh5.googleusercontent.com/-2-8g-bCTumo/VA--eORz4ZI/AAAAAAAABVc/lv-ueu_D41M/w142-h541-no/Jtest_maps_small.png" /></a></div><br/>
<b>Update 4/12/2012:</b> The Jtest and EUtest at GEDmatch now include a new tool called the 4-Ancestors Oracle (aka Oracle-4), as well as the 3D PCAs I promised earlier. Oracle-4 will attempt to pinpoint your ethnic group of origin, and then also work out the most likely combinations of two, three and four ancestral populations which make up your genome. However, this doesn't mean the results will actually show your ethnic group, or those of your parents (in dual mode) or grandparents (4-way mode). They might for many people, but for others they'll reflect the best possible outcomes from the reference samples available.<br/><br/>
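To give a feel for what a tool like this does, here is a toy version of population matching. It is not the GEDmatch implementation, which isn't described here. It simply assumes equal-weight blends of one to four reference averages and ranks them by Euclidean distance to the user's component scores. The reference names and numbers are made up.

```python
from itertools import combinations
from math import dist

def best_mixtures(user, references, max_sources=4, top=3):
    """Rank equal-weight blends of 1..max_sources references by distance to user."""
    results = []
    names = sorted(references)
    for k in range(1, max_sources + 1):
        for combo in combinations(names, k):
            # Average the chosen reference profiles component by component.
            blend = [sum(references[n][i] for n in combo) / k
                     for i in range(len(user))]
            results.append((dist(user, blend), combo))
    results.sort()
    return results[:top]

# Made-up three-component profiles for illustration only.
references = {"A": [80, 10, 10], "B": [10, 80, 10], "C": [10, 10, 80]}
user = [45, 45, 10]
for distance, combo in best_mixtures(user, references):
    print(f"{distance:.2f}", " + ".join(combo))
```

Even in this toy form, the best match is only the closest fit among the references supplied, which is why the output may not name a person's actual ethnic group.<br/><br/>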
Enjoy, and feel free to give feedback to John at GEDmatch if you think it might be useful (but please don't spam his account).</span><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-55948191276290720442012-05-26T23:41:00.001-07:002021-10-22T20:15:45.352-07:00Beware of the "calculator effect"<br/><span style="font-size:100%;"><span style="font-family: arial;">Many people are getting skewed results from so called DIY admixture calculators. For instance, users from the UK often come out much more continental European than they should. Some of them actually believe that this is because they're genetically more Norman or Saxon than the average Brit.<br/><br/>
No, the real reason is what I call the "calculator effect". This is when the algorithm gives different results to people who are part of the ADMIXTURE runs that produced the allele frequencies used by the calculators, than to those who aren't, even though both sets of users are of exactly the same origin, and should expect basically identical results.<br/><br/>
So, is it possible to get around this calculator effect? Yes, people who aren't included in the datasets that produce the allele frequencies used by the calculators shouldn't compare their results to those who are, including the academic references used. They should only compare results to those of other calculator users. On the other hand, members of the various projects who are run as references, should only compare their results to other project members and relevant academic references.<br/><br/>
I've put together a quick experiment to show the "calculator effect" in full force. I ran two intra-North European ADMIXTURE analyses at K=3, Test1 and Test2, and included myself (PL1) only in the former. These tests were almost identical, except for the fact that I wasn't part of the second run. I then tested my genome with calculators made from the allele frequencies from the two runs.<br/><br/>
My calculator results for Test1 were very similar to the results I received from ADMIXTURE, and made perfect sense based on my ancestry. However, the calculator results for Test2 were way off, and basically made me look like a different sample from some other part of Europe. I even managed to score above noise level Far Eastern ancestry in the calculator version of Test2. Please note, however, that all the other individuals received almost identical scores in both tests. The results from the experiment can be seen in the spreadsheet below.<br/><br/>
<a href="https://docs.google.com/spreadsheets/d/1z-4kDvmXVLs3VDfOuVoisGrOcfGKVN_wTMFu6F54Ui8/edit?usp=sharing">Calculator Effect K=3</a><br/><br/>
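A comparison like the one in the spreadsheet can be reproduced with a few lines of code. The sketch below flags samples whose component scores shift by more than a chosen tolerance between two calculator versions. The scores are invented, the tolerance of 5 points is an arbitrary assumption, and REF1 is a hypothetical reference sample.

```python
def flag_shifts(run_a, run_b, tolerance=5.0):
    """Samples whose largest per-component change exceeds tolerance."""
    flagged = {}
    for sample in run_a.keys() & run_b.keys():
        shift = max(abs(a - b) for a, b in zip(run_a[sample], run_b[sample]))
        if shift > tolerance:
            flagged[sample] = shift
    return flagged

# Invented K=3 scores for illustration only.
test1 = {"PL1": [60.0, 30.0, 10.0], "REF1": [20.0, 70.0, 10.0]}
test2 = {"PL1": [35.0, 50.0, 15.0], "REF1": [21.0, 69.0, 10.0]}
print(flag_shifts(test1, test2))
```

In this made-up case only the sample left out of the second run is flagged, while the reference sample barely moves.<br/><br/>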
I have to say I'm disappointed that no one else is talking about the calculator effect, and how to remedy it. I actually designed my Eurogenes ancestry tests for GEDmatch with this problem in mind, by only using academic references to source the allele frequencies. This means that test results for Eurogenes project members and non-members are directly comparable. Perhaps other genome bloggers can eventually do the same?<br/><br/>
See also...<br/><br/>
<a href="https://eurogenes.blogspot.com/2014/10/ancient-genomes-and-calculator-effect.html">Ancient genomes and the calculator effect</a><br/><br/></span>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-50757598399642198382012-04-21T00:32:00.000-07:002019-10-23T03:22:31.009-07:00So who's the most (indigenous) European of us all?<br/><span style="font-size:100%;"><span style="font-family: arial;">Basically, the first map below reveals the answer. It shows the spread of a European-specific cluster from a global ADMIXTURE analysis at K=8 (eight ancestral populations assumed), which I call "North European". Thus, genetically, the most European populations are found around the Baltic Sea, and in particular in the East Baltic region. In my genome collection, samples from Lithuania clearly and consistently score the highest percentages in ADMIXTURE clusters specific to Europe. However, I suspect that if I had Latvians with no known foreign ancestry going back more than four generations, they'd come out the "most European". Hopefully we can test that in the near future.<br/><br/>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivKlBLkB9MYOg-Pw0HojsJDIoi3eEScL7Tu_y9lge8FXfBuMKEvBgmCbiWhbDRqoT3cuSQ6cCyPhbiRvAjRNLD6QJIXD9hxEo1gBPc8k9W9KEI-fAmnrPfCSLNTOdxvNvcOQ7FdVh5z7fA/s1600/North_Euro_K8.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivKlBLkB9MYOg-Pw0HojsJDIoi3eEScL7Tu_y9lge8FXfBuMKEvBgmCbiWhbDRqoT3cuSQ6cCyPhbiRvAjRNLD6QJIXD9hxEo1gBPc8k9W9KEI-fAmnrPfCSLNTOdxvNvcOQ7FdVh5z7fA/s300/North_Euro_K8.jpg" data-original-width="1024" data-original-height="760" /></a></div><br/>
Below are the fifteen Eurogenes sample sets that scored the highest levels of membership in the North European cluster. The list only includes groups with five or more individuals present in the analysis, so some populations, like Estonians or Danes, weren't included, even though they easily made the cut. The spreadsheet with all the results from this run can be seen <a href="https://docs.google.com/spreadsheet/ccc?key=0Ato3EYTdM8lQdGhPS3pMaTZwUUhWbTd0S0hnVkM5M3c#gid=0">here</a>. A table of Fst (genetic) distances between the eight clusters is available <a href="https://img855.imageshack.us/img855/637/eurotestk8fst.png">here</a>.<br/><br/>
<blockquote><b>Lithuanians</b> 77%<br/>
<b>Finns</b> 74%<br/>
<b>Belorussians</b> 70%<br/>
<b>Swedes</b> 69%<br/>
<b>Norwegians</b> 68%<br/>
<b>Kargopol Russians</b> 68%<br/>
<b>Russians</b> 68%<br/>
<b>Poles</b> 68%<br/>
<b>Erzya</b> 66%<br/>
<b>Ukrainians</b> 66%<br/>
<b>Moksha</b> 66%<br/>
<b>Orcadians</b> 63%<br/>
<b>HapMap Utah Americans (CEU)</b> 63%<br/>
<b>Irish</b> 63%<br/>
<b>British</b> 62%</blockquote><br/>
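The selection rule behind this list (groups with five or more individuals, ranked by average membership in the cluster) is easy to express in code. The per-individual scores below are invented so that the averages match the rounded figures above. The real values are in the linked spreadsheet.

```python
from statistics import mean

def top_groups(scores_by_group, min_size=5, limit=15):
    """Average score per group, keeping only groups with at least min_size members."""
    eligible = {group: mean(scores)
                for group, scores in scores_by_group.items()
                if len(scores) >= min_size}
    ranked = sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]

# Invented individual scores for illustration only.
scores = {
    "Lithuanians": [78, 76, 77, 77, 77],
    "Estonians": [76, 77],            # too few individuals, so left out
    "British": [62, 61, 63, 62, 62],
}
for group, average in top_groups(scores):
    print(group, average)
```

The Estonian entry shows how a group can score high enough yet still be left off the list because of its sample size.<br/><br/>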
So why did I pick the results from K=8, and not some other K, like 2, 10, or 25? Well, it's not possible to evaluate who is more European without a European-specific cluster (i.e. modal in Europeans, with a low frequency outside of Europe). Provided that a decent number and range of global and West Eurasian samples are used in the analysis, such clusters begin appearing at around K=5 or K=6, and start breaking up into local clusters from about K=9. I found that runs below K=8 produced European clusters that spilled too generously outside of the borders of Europe. On the other hand, runs above K=8 produced European clusters that weren't representative of enough European groups (i.e. too localized). But the European cluster from K=8 was pretty much perfect, and I think that's obvious from the map. In fact, I can hardly believe how well it fits the modern geographic concept of Europe - north of the Mediterranean and west of the Urals. Amazing stuff.<br/><br/>
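The idea of a European-specific cluster (modal in Europeans, low elsewhere) can be turned into a rough check on the population averages from any given K. In the sketch below, the 10% ceiling for non-European groups is an arbitrary assumption, and the non-European values are made up. Only the Lithuanian and British figures come from the list above.

```python
def is_region_specific(cluster_means, region_members, max_outside=10.0):
    """True if the cluster peaks inside the region and stays low outside it."""
    peak_pop = max(cluster_means, key=cluster_means.get)
    outside = [value for pop, value in cluster_means.items()
               if pop not in region_members]
    return peak_pop in region_members and all(v <= max_outside for v in outside)

# Non-European values are invented for illustration only.
north_euro = {"Lithuanians": 77, "British": 62, "Druze": 6, "Han": 0.5}
europe = {"Lithuanians", "British"}
print(is_region_specific(north_euro, europe))
```

A cluster that spills too far outside Europe would fail the ceiling test, which is the problem described above for the lower K runs. This check says nothing about clusters that are too localized.<br/><br/>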
There are two other clusters that show up across Europe in non-trivial amounts - Mediterranean and Caucasus (see maps below). These can also be thought of as native European clusters, since they've been on the continent for thousands of years. However, their peak frequencies are found in West Asia, so they're not particularly useful signals of European-specific ancestry.<br/><br/>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUExEWRvPHPdGfWJvrKHhL_ZreRZWHgH-fLLvJh2GZNGHnoOPCkf3pwm7bAjzuunSO4DReaEiKwJt9WZ2xsuKRBFHXujOK-XYxjAEmo_pHQLUzGthHON-DvvRHnXX13mvI4AiVdqpICFKV/s1600/Med_K8.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUExEWRvPHPdGfWJvrKHhL_ZreRZWHgH-fLLvJh2GZNGHnoOPCkf3pwm7bAjzuunSO4DReaEiKwJt9WZ2xsuKRBFHXujOK-XYxjAEmo_pHQLUzGthHON-DvvRHnXX13mvI4AiVdqpICFKV/s300/Med_K8.jpg" data-original-width="1024" data-original-height="760" /></a></div><br/>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtV5hZQ3RnR6QWcpSL4ERKRswW1OvEmpHKHpGBVI5NpMcqEgN5YY_HXSht9KFYx9AbJUNDYBSYjSvlcru5ITi9sSobd6eDwrPK-aKfHF9zP4v-wSKQHgb47LiRPDewvaq5fwMJqIuke5Q/s1600/Caucasus_K8.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtV5hZQ3RnR6QWcpSL4ERKRswW1OvEmpHKHpGBVI5NpMcqEgN5YY_HXSht9KFYx9AbJUNDYBSYjSvlcru5ITi9sSobd6eDwrPK-aKfHF9zP4v-wSKQHgb47LiRPDewvaq5fwMJqIuke5Q/s300/Caucasus_K8.jpg" data-original-width="1024" data-original-height="760" /></a></div><br/>
So what do these three clusters show exactly? They represent certain allele frequencies in modern populations, and in fact, these can change fairly rapidly due to admixture, selection, and genetic drift. So claiming that such clusters represent pure ancient populations is unlikely to be true in most cases, if ever. However, I don't think there's anything wrong in saying that, when robust enough, they can be thought of as signals of ancestry from relatively distinct ancestral groups.<br/><br/>
Indeed, anyone who's read up on the prehistory of Europe knows that there are three general Neolithic archeological waves to consider when trying to untangle the story of the peopling of Europe. These are Mediterranean Neolithic, Anatolian Neolithic and Forest Neolithic (for example, see <a href="https://en.wikipedia.org/wiki/File:Neolithic_expansion.svg">here</a>).<br/><br/>
Mediterranean Neolithic refers to a series of migrations from West Asia via the Mediterranean and its coasts. The areas most profoundly affected by these movements include the islands of Sardinia and Corsica, and the Southwest European mainland. Anatolian Neolithic describes migrations into Europe from modern day Turkey, mostly into the Balkans, but also as far as Germany and France. At the moment, Forest Neolithic of Northeastern Europe is something of a mystery. However, the general opinion is that it was largely the result of native Mesolithic hunter-gatherers adopting agriculture.<br/><br/>
Obviously, it's very difficult to dismiss the correlations between these three broad archeological groups and the European and two European/West Asian clusters produced in my K=8 ADMIXTURE analysis. Is it a coincidence that the Mediterranean cluster today peaks in Sardinia, which has been largely shielded from foreign admixture since the Neolithic, and today forms a very distinct Southern European isolate? Why does the North European cluster show the highest peaks in classic Forest Neolithic territory? And why does the Caucasus cluster radiate in Europe from the southeast, which is where Anatolian farmers had the greatest impact? These can't all be coincidences, and I'm willing to bet that none of them are. I'm convinced that the three clusters from my K=8 run are strong signals from the Neolithic, and the North European cluster also from the Mesolithic.<br/><br/>
Eventually, these issues will be settled with ancient DNA data, in a much more comprehensive way than ever possible using modern genomes. We've already seen some preliminary results, mostly from Mesolithic, Neolithic and Bronze Age sites around Europe, so perhaps it's useful to ask whether my ADMIXTURE analysis and commentary here mirror these early findings? I think they do. For instance, here's an interesting conclusion regarding the East Baltic area from a study on ancient Scandinavian mtDNA by Malmström et al.<br/><br/>
<blockquote>Through analysis of DNA extracted from ancient Scandinavian human remains, we show that people of the Pitted Ware culture were not the direct ancestors of modern Scandinavians (including the Saami people of northern Scandinavia) but are more closely related to contemporary populations of the eastern Baltic region. Our findings support hypotheses arising from archaeological analyses that propose a Neolithic or post-Neolithic population replacement in Scandinavia [7]. <b>Furthermore, our data are consistent with the view that the eastern Baltic represents a genetic refugia for some of the European hunter-gatherer populations.</b></blockquote><br/>
I suppose there will be people wondering why I didn't take Sub-Saharan African, East Asian, and South Asian admixtures into account in my analysis. The reason is that I wasn't looking at which group was most West Eurasian, or Caucasoid. Based on everything I've seen to date, in my own work as well as elsewhere, the most West Eurasian group would probably be the French Basques from the HGDP. However, the differences between them, and certain groups from Northeastern Europe, like Northern Poles and Lithuanians, really wouldn't be that great anyway. I might do a write up about that at some point.<br/><br/><br/>
Credits...<br/><br/>
- Maps by Eurogenes project member FR7<br/><br/>
- Additional stats by Eurogenes project member DESEUK1<br/><br/><br/>
References...<br/><br/>
Helena Malmström et al., <a href="https://www.cell.com/current-biology/abstract/S0960-9822(09)01694-7">Ancient DNA Reveals Lack of Continuity between Neolithic Hunter-Gatherers and Contemporary Scandinavians</a>, Current Biology, 24 September 2009, doi:10.1016/j.cub.2009.09.017<br/><br/>
Noreen von Cramon-Taubadel and Ron Pinhasi, <a href="https://rspb.royalsocietypublishing.org/content/early/2011/02/17/rspb.2010.2678.abstract">Craniometric data support a mosaic model of demic and cultural Neolithic diffusion to outlying regions of Europe</a>, Proc. R. Soc. B published online 23 February 2011, doi: 10.1098/rspb.2010.2678</span></span><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.comtag:blogger.com,1999:blog-580285511478951760.post-70954245006519962302012-02-26T15:24:00.016-08:002019-10-23T03:19:36.913-07:00Genetic substructures within the HapMap CEU sample (and Eurogenes' Northwest Europeans)<br/><span style="font-size:100%;"><span style="font-family: arial;">In this experiment I attempt to characterize more precisely the origins of some of the individuals from the HapMap CEU cohort. These samples are described by the HapMap project as Utah Americans of Western and Northern European descent. But this doesn't seem to be exactly true for at least two of them, who actually come out very Central European in all my tests. Moreover, it's obvious that some of the samples fit nicely into very specific areas of Western and Northern Europe. For instance, at this level of resolution, a few could pass as Irish, and others for Danes or even Swedes. Below is a quick and dirty ADMIXTURE analysis designed specifically for this experiment.</span><br /><br /><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://drive.google.com/file/d/0B9o3EYTdM8lQWlFTVmNVNjZHNTQ/view?usp=sharing" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSDgtZFh58_YH8bZthAsuKLv6mEKfxjj3ljmxLBBCpa4bkFK2-P2mRftt1ohrFymSb2clqxlm4YbtUZNRSKMyS_aI5gzt_BkpoUkJnCsoqVdI8WjkGt9LPfYXW-ZzkLvviCj6twhlsrJZf/s420/CEU6_small.png" /></a></div><br />
<span style="font-family: arial;">Key: Red = Sub-Saharan African, Yellow = Southern European, Green = North-Central European, Aqua = North Atlantic, Blue = Baltic, Pink = East Asian. See </span><a style="font-family: arial;" href="https://docs.google.com/spreadsheet/ccc?key=0Ato3EYTdM8lQdHI3OC1zb3A3LXFBek5hZUJFVVhxamc">spreadsheet</a><span style="font-family: arial;"> for details.</span><br /><br /><br /><span style="font-family: arial;">Based on the K=6 results it's fair to say that at least six of the CEU samples might pass for unmixed Scandinavians, most likely Danes or southern Swedes (NA12003, NA12057, NA12248, NA12249, NA12776 and NA12875). At least five could be confused for Irish or western British samples (NA10850, NA12005, NA12006, NA12386 and NA12812). The two Central European-like Utahns stick out from the CEU set due to their unusually high Baltic scores (NA11917 and NA12286). From the little I know about the CEU samples, I'd say that these two were of eastern or southeastern German origin. But they might have fairly recent ancestry from further east than that. My own MDS analysis (first image below) and a PCA plot from <a href="https://www.cell.com/current-biology/abstract/S0960-9822(08)00956-1">Lao et al. 2008</a> (second image, slightly edited by me to remove article text) confirm that such Scandinavian-like, German-like and Irish-like individuals do exist in the CEU set.</span><br /><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://drive.google.com/file/d/0B9o3EYTdM8lQa2FkMV8yUjNXLUE/view?usp=sharing" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-g_a-TmOOzPN6ihjv78Z-hDN4pXY40MbiuDiSbvY9y0vzhmFyXua1C3byzPkTjZoroGuiCeRSN1T9HFz91O1D0r3MT7MXPXigQYMlzddE4FpdzgnDxScAKSERXO06haAXdYNgetSD365i/s350/CEU_mds_small.png" /></a></div><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://drive.google.com/file/d/0B9o3EYTdM8lQbGFSVWF3ellOcnM/view?usp=sharing" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhC56ai3vkrxIlqPuUdPODgZzGlpm-Eea1ydEO3ryf0iBdxja8bZPuTui3vUbSYAY08JcKfBu1NZSNWZ3x5uy7HoHscuUW1yEAGyW4ln-Rnw9Svgf40Fybn3_YDrt-S1o3KajGTi5AgPCJi/s350/CEU_Lao_small.png" /></a></div><br />
<span style="font-family: arial;">So the CEU set is not a homogeneous one, and carries clear substructures that can be picked up via fairly basic means. However, this doesn't make the CEU samples less valuable, but more so, due to the lack of public access to continental Northwestern European samples. Secondly, the test reveals some interesting information about the genetic substructures within Northwestern Europe. Here are some of my observations: </span><br /><br /><blockquote><span style="font-family: arial;">- Scandinavians often show very high levels of the North-Central European component, and moderately high levels of the North Atlantic component. Many also carry clear amounts of the Baltic component, but, as a rule, lower levels of the Southern European component.</span><br /><br /><span style="font-family: arial;">- Germans mainly differ from the Scandinavians in that they carry the Southern European component at appreciable amounts. They show variable amounts of the Baltic component, with those from eastern Germany carrying the highest levels.</span><br /><br /><span style="font-family: arial;">- Irish project members, especially those from western Ireland, show very high levels of the North Atlantic component, but low levels of the Southern European component. </span><br /><br /><span style="font-family: arial;">- Western British samples, like those from Cornwall or western Scotland, are generally very similar to the Irish, mainly in that they carry the North Atlantic component at high levels. However, they often show somewhat higher levels of the Southern European component. </span></blockquote><br /><span style="font-family: arial;">I'm eventually going to test these classifications of the CEU samples with </span><a style="font-family: arial;" href="https://bga101.blogspot.com/2012/01/eurogenes-north-euro-clusters-phase-2.html">ChromoPainter</a><span style="font-family: arial;">, which is by far the most accurate tool for such things at the moment. 
Unfortunately, it's also a lot of hard work and computationally intensive, so it might take a few weeks. I do have the allele frequencies from the above ADMIXTURE run, and it is possible to make a standalone test from them. However, I'm not certain that's a good idea at present, due to the small number of samples involved. It might be worth doing when the right samples swell in number, so I can run a more robust analysis. In particular, I need more people from Ireland, Scotland and Scandinavia.<br /><br />Reference...<br /><br />Oscar Lao et al., <a href="https://www.cell.com/current-biology/abstract/S0960-9822(08)00956-1">Correlation between Genetic and Geographic Structure in Europe</a>, Current Biology, Volume 18, Issue 16, 1241-1248, 26 August 2008, doi:10.1016/j.cub.2008.07.049</span></span><br/><br/>Davidskihttp://www.blogger.com/profile/04637918905430604850noreply@blogger.com