Admixture Calculators



Below are detailed notes on some of the Admixture Calculators from Gedmatch.com. For some reason, pages from Gedmatch’s wiki won’t load unless you’re logged in and open the wiki from within Gedmatch, so I have copied the info here to be able to link to it from my blog. Some of these calculators are no longer available.

Admixture Calculators

Eurasia K3 - E Eurasian, W Eurasian, and Sub-Saharan African Calculator

Eurasia K9 - Ancestral South Indian and Caucuses Hunter Gatherer Calculator

Eurasia K10 - Caucuses Hunter Gatherers Calculator

Eurasia K11 - Caucuses Hunter Gatherer and Neolithic Anatolian Farmer Calculator

Eurasia K14 - Neolithic Calculator

Eurasia K15 - Indian Subcontinent Calculator

Gedrosia K11 - Kalash Calculator

Gedrosia K12 - South Asian and West Asian Ancestry

PuntDNAL K15 - African Ancestry Calculator

MDLP K13 Ultimate - Deep Origins of Populations Calculator

Eurasia K3 - E Eurasian, W Eurasian, and Sub-Saharan African Calculator

This calculator calculates an individual's E Eurasian, W Eurasian, and Sub-Saharan African admixture.

The components are defined as follows:

1- E Eurasian - This component peaks at about 100% in E & SE Asian populations such as Ami, Nivkh, Dai, Han, and Ulchi, followed by Siberian & other Asian populations such as Nganasans, Tibetans, Subba, and Mongola.

2- W Eurasian - This component peaks in Neolithic European farmers such as Stuttgart, and LBK culture, as well as in most modern European populations at over 95%.

3- SSA (Sub-Saharan African) - This component peaks in Sub-Saharan African populations such as Yoruba, Esan, and Luhya at over 97%.

For further information, contact the project creator at dilawerkh4@gmail.com

Eurasia K9 - Ancestral South Indian and Caucuses Hunter Gatherer Calculator

This calculator is modeled around the ancient Ancestral South Indian component, ASI, which peaks in the Onge, Andamanese, and a few Indian tribal populations. The Onge are believed to have taken the coastal migration route from Arabia to India over 40,000 years ago.

The Onge, together with the other Andamanese peoples who inhabit the Andaman Islands (which are governed by India), are an endangered people and number only in the few hundreds.

The populations used to source ASI allele frequencies have much less W Eurasian admixture than the majority of present day Indians. These populations include Paniya, Kharia, Ho, and Puliyar.

To increase the accuracy of the calculator, only the set of SNPs common to all of the sample genomes was used. This common denominator set turned out to be around 33,000 SNPs, which is more than adequate because the fixation distances between the calculator components are relatively high.
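The idea of a common denominator SNP set can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project creator's actual pipeline: the function name `common_snps`, the dictionary layout, the sample names, and the use of `"--"` or `None` as a no-call marker are all hypothetical.

```python
from functools import reduce


def common_snps(samples):
    """Return the sorted SNP IDs that have a called genotype in every sample.

    `samples` maps a sample name to a dict of {snp_id: genotype}.
    A genotype of None or "--" is treated as a no-call (an assumption).
    """
    called = [
        {snp for snp, genotype in genotypes.items() if genotype not in (None, "--")}
        for genotypes in samples.values()
    ]
    if not called:
        return []
    # Intersect the per-sample sets so only SNPs present everywhere survive.
    return sorted(reduce(set.intersection, called))


# Made-up toy data: three samples with partially overlapping SNP coverage.
samples = {
    "ancient_A": {"rs1": "AA", "rs2": "--", "rs3": "CT"},
    "modern_B": {"rs1": "AG", "rs2": "CC", "rs3": "TT"},
    "modern_C": {"rs1": "GG", "rs3": "CT", "rs4": "AA"},
}
shared = common_snps(samples)
print(shared)
```

In this toy data only rs1 and rs3 are called in all three samples, so the shared set shrinks quickly as low-coverage genomes are added, which is why the real set ended up far smaller than a typical genotyping chip.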

To prevent a W Eurasian heavy Indian cluster from forming, I did not include in this run the multitude of Indian populations that I would normally include. Consequently, Indians may find their oracle distances to be high, since the majority of Indian proxy groups are missing from this calculator.
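To see why missing proxy groups raise oracle distances, here is a minimal sketch that assumes the distance is a plain Euclidean distance between vectors of admixture percentages. That formula is an assumption for illustration, not Gedmatch's documented method, and the function names, population labels, and numbers are all made up.

```python
import math


def oracle_distance(individual, population):
    """Euclidean distance between two vectors of admixture percentages."""
    if len(individual) != len(population):
        raise ValueError("vectors must have the same number of components")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(individual, population)))


def rank_populations(individual, references):
    """Return (name, distance) pairs sorted from closest to farthest."""
    pairs = [
        (name, oracle_distance(individual, profile))
        for name, profile in references.items()
    ]
    return sorted(pairs, key=lambda pair: pair[1])


# Hypothetical three-component profiles, in percent.
references = {
    "pop_A": [50.0, 30.0, 20.0],
    "pop_B": [10.0, 10.0, 80.0],
}
individual = [48.0, 32.0, 20.0]
ranked = rank_populations(individual, references)
print(ranked[0][0])
```

If a close proxy such as the hypothetical pop_A is removed from the reference set, the best remaining match is the much more distant pop_B, so the reported distance rises even though the individual's own percentages have not changed.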

Here is a brief description of the components:

1- WHG peaks in Loschbour, La Brana, Bichon, and

2- EHG peaks in Karelia HG

3- ASI peaks in the Onge and other Andamanese, followed by Indian tribals such as Puliyar, Ho and Paniya

4- Siberian / E Asian peaks in Nganasan followed by Ulchi

5- SW Asian peaks in Bedouin B followed by Saudi

6- ENF peaks in Neolithic Anatolians and LBK cultures of the Balkans

7- W African peaks in Esan and Yoruba

8- SE Asian peaks in Ami, Dai, followed by Naga

9- CHG peaks in Satsurblia CHG, followed by Kotias CHG, and the Baloch & Brahui of Pakistan

For questions, contact the project creator at Dilawerkh4@gmail.com

Eurasia K10 - Caucuses Hunter Gatherers Calculator

The focus of this calculator is a set of about 47 recently sequenced ancient genomes described in Jones et al, "Upper Palaeolithic genomes reveal deep roots of modern Eurasians" at [1], and in "Genome-wide patterns of selection in 230 ancient Eurasians" at [2].

It appears that the Caucasus Hunter Gatherers (CHG) are the mystery W Asian population described in Haak et al, "Massive migration from the steppe was a source for Indo-European languages in Europe" [3], as having contributed to the Yamnaya Steppe Herders of Eurasia along with Eastern Hunter Gatherers (EHG).

This calculator uses two CHG genomes from Georgia in the Caucasus to source transversion SNP allele frequencies: a Mesolithic genome (9,700 years old) named Kotias, and an Upper Paleolithic male (13,300 years old) named Satsurblia. It also uses the genome of a 13,700 year old Upper Paleolithic male named Bichon from Switzerland. This is in addition to other ancient genomes, and 1000+ individuals from various regions of the world.

Here is a description of the calculator's components:

Anatolian Farmer: This component is based on sequenced Neolithic farmers who likely introduced farming into Europe from the Near East. The ancient genomes Stuttgart and Otzi score high here, as well as modern Sardinians.

EHG: This component is based on two 7,500 year old Karelian hunter gatherers from the Russian steppe. The 24,000 year old genome MA1 and the Yamnaya Steppe Herders also score high in this component.

S Indian: Many S Indian tribal populations were used to source allele frequencies for this component. It peaks in the Pulliyar.

Amerindian: This is based on the Karitiana of S America.

W African: The Yoruba & Esan of W Africa were used to source allele frequencies for this component.

Papuan: Based on Papuans of New Guinea.

SW Asian: Peaks in the Bedouin, Saudis, and other SW Asian populations

E Asian: This component includes signals modal to the Ulchi of far Eastern Russia, the Nganasans of Siberia, and the Ami of SE Asia.

CHG: This component is the focus of this calculator. It is modal to Satsurblia and Kotias. Among modern populations, Balochis, Brahuis, and Makranis of SC Asia score the highest, followed by Caucasus groups such as Abkhasians, Adygei, and Georgians. Iranic groups such as the Kurds also score high in this component.

WHG: Western European Hunter Gatherer; This component is modal to the ancient genomes Loschbour, Bichon, and La Brana, who score >97%.

For further information, contact the project creator at dilawerkh4@gmail.com

Eurasia K11 - Caucuses Hunter Gatherer and Neolithic Anatolian Farmer Calculator

The focus of this calculator is a set of about 47 recently sequenced ancient genomes described in Jones et al, "Upper Palaeolithic genomes reveal deep roots of modern Eurasians" at [4], and in "Genome-wide patterns of selection in 230 ancient Eurasians" at [5].

It appears that the Caucasus Hunter Gatherers (CHG) are the mystery W Asian population described in Haak et al, "Massive migration from the steppe was a source for Indo-European languages in Europe" [6], as having contributed to the Yamnaya Steppe Herders of Eurasia along with Eastern Hunter Gatherers (EHG).

This calculator uses two CHG genomes from Georgia in the Caucasus to source transversion SNP allele frequencies: a Mesolithic genome (9,700 years old) named Kotias, and an Upper Paleolithic male (13,300 years old) named Satsurblia. It also uses the genome of a 13,700 year old Upper Paleolithic male named Bichon from Switzerland. This is in addition to other ancient genomes, and 1000+ individuals from various regions of the world.

Here is a description of the calculator's components:

Neolithic Anatolian Farmer: This component is based on sequenced Neolithic farmers who likely introduced farming into Europe from the Near East. The ancient genomes Stuttgart and Otzi score high here, as well as modern Sardinians.

EHG: This component is based on two 7,500 year old Karelian hunter gatherers from the Russian steppe. The 24,000 year old genome MA1 and the Yamnaya Steppe Herders also score high in this component.

S Indian: Many S Indian tribal populations were used to source allele frequencies for this component. It peaks in the Pulliyar.

Amerindian: This is based on the Karitiana of S America.

W African: The Yoruba of Nigeria were used to source allele frequencies for this component.

Papuan: Based on Papuans of New Guinea.

E African: Peaks in the Somalis and Masai of E Africa.

E Asian: Peaks in the Ulchi of far Eastern Russia. Siberians and SE Asians also score very high here.

CHG: This component is the focus of this calculator. It is modal to Satsurblia and Kotias. Among modern populations, Balochis, Brahuis, and Makranis of SC Asia score the highest, followed by Caucasus groups such as Abkhasians, Adygei, and Georgians. Iranic groups such as the Kurds also score high in this component.

Kalash: This is modal to the Kalash of the mountains of N Pakistan. Their physical, religious, and cultural isolation from their neighbors has maintained their relative genetic homogeneity. They appear to have originated from the Caucasus region sometime during the Neolithic.

WHG: Western European Hunter Gatherer; This component is modal to the ancient genomes Loschbour, Bichon, and La Brana, who score 99%.

For further information, contact the project creator at dilawerkh4@gmail.com

Eurasia K14 - Neolithic Calculator

This calculator's focus is calculation of admixture proportions and modeling of an individual based on a collection of ancient Neolithic and Bronze Age genomes from across Eurasia.

To counteract the effect of transition SNP substitutions caused by deamination of cytosine during amplification and sequencing of ancient genomes, I have removed transition SNPs from the dataset genomes to allow for allele frequency sourcing based on transversion SNPs only. This makes the comparison of the individual's genome to the ancient genome more accurate, and negates some of the recent genetic drift in modern populations.
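As a rough illustration of the filtering step described above, the sketch below keeps only transversion SNPs (a purine swapped for a pyrimidine or vice versa) and drops transitions (A/G and C/T). The tuple layout `(snp_id, allele1, allele2)`, the function names, and the SNP IDs are hypothetical; this is not the project creator's actual tooling.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transversion(allele1, allele2):
    """Return True if the two alleles differ by a purine/pyrimidine swap."""
    a, b = allele1.upper(), allele2.upper()
    if a not in BASES or b not in BASES or a == b:
        raise ValueError(f"not a biallelic SNP: {allele1}/{allele2}")
    # Transitions stay within one chemical class (A<->G or C<->T);
    # transversions cross between the two classes.
    return (a in PURINES) != (b in PURINES)


def keep_transversions(snps):
    """Filter (snp_id, allele1, allele2) tuples down to transversions only."""
    return [snp for snp in snps if is_transversion(snp[1], snp[2])]


# Made-up SNP list: rs1 and rs2 are transitions, rs3 and rs4 are transversions.
snps = [
    ("rs1", "A", "G"),
    ("rs2", "C", "T"),
    ("rs3", "A", "C"),
    ("rs4", "G", "T"),
]
kept = keep_transversions(snps)
print([snp_id for snp_id, _, _ in kept])
```

Cytosine deamination shows up as apparent C to T (and, on the opposite strand, G to A) changes, which are both transitions, so discarding all transition SNPs removes the sites where this damage could be mistaken for real variation.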

I have intentionally not included any Caucasus based populations, to allow for a better estimation of the genetic contribution of Bronze Age Eurasian Steppe Pastoralists to the genomes of modern South and West Asians. The removal of the Caucasus populations also allows for a better estimation of shared ancestry with the isolated and drifted Kalash of the Hindu Kush region of Pakistan.

A brief explanation of the calculator's components:

1- Neolithic Balkan Farmers: This component is modal to the 7000 year old Starcevo and LBK farming cultures of the Balkans. These communities represented the earliest farming communities in Europe, and are believed to have originated in the Near East. This component peaks at 99% in ancient genomes from those cultures and in Tyrolean Otzi.

2- Early European Farmers: This component was created either with the expansion of the Neolithic Balkan Farmers into mainland Europe, or when a different wave of farmers emigrated from the Near East into Europe. It is a hybrid Neolithic Farmer / West European Hunter Gatherer component, and is well represented in an ancient genome recovered near Stuttgart, Germany, and in modern Sardinians and other southern Europeans.

3- Afansievo_Yamnaya: This peaks at 99% in the Afanasievo and Yamnaya Steppe Pastoralist cultures of the Eurasian Steppe in Russia.

4- SW Asian: This is modal in Bedouin where it peaks at 99%. It is also very well represented by populations such as Saudis and Yemenis

5- Kalash: Peaks at 99% in the drifted and isolated Kalash of the Hindu Kush mountains of Pakistan

6- E African: Peaks in Somali and Masai

7- Siberian: Peaks at 99% in Nganasan

8- S Amerindian: Peaks at 99% in Karitiana

9- N Amerindian: Peaks at 99% in Pima

10- SE Asian: Peaks at 99% in Ami

11- SHG-WHG: Peaks at 99% in Scandinavian Hunter Gatherers such as Motala, and also at 99% in La Brana

12- S Indian: Peaks in S Indian tribals such as Pulliyar at 86%

13- Papuan: Peaks at 99% in Papuans

14- Sub-Saharan: Peaks at 99% in Yoruba and Esan

For further information, contact the project creator at dilawerkh4@gmail.com

Eurasia K15 - Indian Subcontinent Calculator

This calculator's focus is the Indian subcontinent. Many different Indian populations from various merged datasets were used to source allele frequencies, from both transition and transversion SNPs. South and West Asians may notice that their S Asian or S Indian component score is lower here than in some other calculators. The reason is that I have broken up S Asian / S Indian into its various sub-components: Burusho, Kalash, Balochi, NE Indian Tribal, and Paniya.

Here is a brief description of the calculator's components:

1- SW Asian - Peaks in Saudis & Yemenite Jews

2- SE Asian- This component is modal in the Ami tribe of Taiwan, who speak an Austronesian language

3- E Asian - This component peaks in the Ulchi and Nivkh peoples of Russia's far east, followed by the Yukagir of E Siberia

4- Siberian - This component is modal in the Nganasans of N Siberia, who are believed to be descendants of Paleo-Siberians. It is also high in the Yukagir of E Siberia

5- WHG- West European Hunter Gatherer peaks in various NE European populations

6- Early European Farmers (EEF) - Represents the expansion of farming communities from the Near East into Europe during the Neolithic. It peaks in Sardinians

7- Balochi- This component is modal in the Balochi, Brahui, and Makrani populations of Pakistan

8- Burusho - This component peaks in the Burusho of N Pakistan

9- Kalash - This component is modal in the isolated and drifted Kalash people of the Hindu Kush mountains of Pakistan

10- Caucasus - This component peaks in Abkhasians, Georgians, and Adygei

11- NE_Indian_Tribal - This component peaks in:

a- Bhumij, who are a tribal Adivasi people living primarily in the Indian states of Assam, West Bengal, Odisha and Jharkhand. They are speakers of an Austroasiatic language.

b- Birhor, who are a tribal Adivasi forest people, traditionally nomadic, living primarily in the Indian state of Jharkhand. They also are speakers of an Austroasiatic language.

c- Kharia, who are a tribal Adivasi people, inhabiting Bihar, Madhya Pradesh, Odisha and West Bengal, with some in Assam and the Andaman islands. They also speak an Austroasiatic language.

d- Gond, who are a Dravidian people of central India, spread over the states of Madhya Pradesh, Uttar Pradesh, Telangana, Andhra Pradesh and Western Odisha.

12- Paniya - This component peaks in:

a- Paniya, who are a scheduled Indian tribe, inhabiting primarily the state of Kerala. They speak a Dravidian language

b- It is also high in the Kattunayakkan, who are an ancient scheduled tribe in the Indian states of Andhra Pradesh, Karnataka, Kerala and Tamil Nadu. They are one of the earliest known inhabitants of the Western Ghats, who are engaged in the collection and gathering of forest produce, mainly wild honey and wax.

13- Onge - This component is modal to the Onge, who are a scheduled Indian tribe of the Andaman Islands. They are on the brink of extinction, with fewer than 200 surviving. They are believed to have split from the mainland Asia "Out-of-Africa" migrations around 48,000 years ago.

14- E African - This component peaks in Somalis and the Masai

15- W African - This component peaks in the Yoruba and Esan of W Africa

For further information, contact the project creator at dilawerkh4@gmail.com

Gedrosia K11 - Kalash Calculator

The Kalash, an Indo-European people of Pakistan, are one of the world's most genetically and culturally isolated ethnic groups. Their ancient non-Islamic, Rigvedic type religion, combined with their physical isolation in the Hindu Kush mountains, has contributed to maintaining their relatively "unmixed" gene pool for the past several thousand years.

Genetically, they appear to be an early Indo-European offshoot. According to tests that I have conducted using Admixture, they appear to be approximately 60% Caucasus derived and 40% S Asian derived. They seem to lack the Caucasus derived Gedrosian signal, indicating that their Caucasus derived admixture pre-dates the differentiation of the Caucasus signal into Caucasus vs. Gedrosian components.

In more formal statistical tests that I have performed, which tend to ignore recent genetic drift, they appear to be genetically close to NE European and Caucasus populations.

This calculator is most accurate for individuals with predominantly S Asian or W Asian Ancestry. It is least accurate for individuals with predominantly African or Native American ancestry. Since I have not used African populations to source allele frequencies, Africans will appear predominantly SW Asian.

The calculator's strongest points are the Gedrosian and Kalash signals, although the Kalash shared ancestry percentage obtained here will tend to be an underestimate, as the calculator does not account for the Kalash recent genetic drift.

This calculator's 11 components peak as follows:

1- WHG (W European Hunter Gatherer) - Loschbour & NE Europeans;

2- S Indian - Various S Indian tribal populations, such as Hakkipikki and Nihali;

3- Gedrosian - The Baloch, Brahui, and Makrani of Pakistan;

4- SW_Asian - Saudis, Yemenis, and Bedouin;

5- Siberian - Nganasans;

6- EEF (Early European Farmers) - LBK, Sardinians, and Stuttgart;

7- E Asian - Ulchis;

8- Caucasus - Georgian, Abkhasians, Adygei, and Balkar;

9- Kalash - Kalash of Pakistan;

10- Indo-Chinese - Kusunda peoples;

11- SE Asian - Ami & Dai.

For further information, contact the project creator at dilawerkh4@gmail.com

Gedrosia K12 - South Asian and West Asian Ancestry

This calculator has been designed for individuals of predominantly South Asian and West Asian ancestry, for inferring Gedrosian (Balochi) admixture. Since those populations were mostly used to source allele frequencies, individuals with majority ancestry from outside those regions will most likely find this calculator less accurate and informative.

Many Indian tribal populations were used to source S. Indian allele frequencies. Although the West Asian populations used are adequate, an update may be released in the future which will include a few more W Asian populations.

The Balochi signal peaks in the Balochi/Brahui/Makrani populations of Pakistan.

The Bronze Age Sintashta Steppe Herder signal in this calculator reflects Eurasian steppe admixture in excess of what is included in the Caucasus or Balochi signals.

The genotype rate has been optimized for 23andMe users. Users genotyped with FTDNA or Ancestry DNA will have slightly less accurate results than 23andMe users.

For further information, contact the project creator at dilawerkh4@gmail.com

PuntDNAL K15 - African Ancestry Calculator

For more information see Anthrogenica.

Results are only meaningful for persons who have 100% African ancestry.

The components are defined as follows:

[Image: PuntDNAL15.jpg]

MDLP K13 Ultimate - Deep Origins of Populations Calculator

How it was developed

The final "cleaned" dataset (2250 samples and 155140 SNPs) laid the foundations of the beta version of MDLP K13 'Ultimate', which is specially designed for the analysis of "deep" origins of populations. The new model of the calculator is based on the design of the DIYDodecad calculator. Although model-based clustering (using the Mclust algorithm) gives good reason to believe that a dataset of 2,230 genomes is best described by a model of 8 clusters (i.e., without the degeneration of components inevitable for larger K), I ran ADMIXTURE (in supervised mode) with K=11, i.e., 11 clusters. Then I took the .P file of allele frequencies in the inferred 11 "components" and used it to create 11 simulated ancestral populations.
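The step of turning allele frequencies into simulated ancestral populations could look something like the Python sketch below. It assumes each simulated individual is drawn independently per SNP, with two Bernoulli draws at that component's allele frequency (i.e. a Binomial(2, p) genotype). The notes do not say how the simulation was actually done, so the sampling scheme, the function names, and the toy frequencies are all assumptions.

```python
import random


def simulate_individual(freqs, rng):
    """Draw one diploid genotype per SNP: 0, 1 or 2 copies of the counted allele."""
    # Two independent draws per SNP, one for each chromosome copy.
    return [sum(rng.random() < p for _ in range(2)) for p in freqs]


def simulate_population(freqs, n_individuals, seed=0):
    """Simulate `n_individuals` dummy genomes from one component's frequencies."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [simulate_individual(freqs, rng) for _ in range(n_individuals)]


# Made-up allele frequencies for one component at four SNPs; in practice
# these would be one column of the ADMIXTURE .P file (one row per SNP).
component_freqs = [0.0, 0.25, 0.5, 1.0]
dummies = simulate_population(component_freqs, n_individuals=5)
for genotype in dummies:
    print(genotype)
```

Repeating this for each of the 11 columns would give 11 sets of dummy individuals, which can then be merged with the real dataset as reference populations for a supervised run.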

After this procedure, I isolated:

1) ANE component from North-Eurasian component by interpolating the non-East-Asian part of Native Americans' ancestry.

2) Caucasian component by deducing the "non-steppe" part of Yamnaya's ancestry. Once again, I created "dummy individuals" for both isolated components, merged them with the rest of the dataset and ran ADMIXTURE K13 in supervised mode.

The components are defined as follows:

Amerindian - the modal component of the Native American

ANE - the modal component of the Northern Eurasians, which has been isolated from the common cluster with WHG. The highest values are in the samples of MA1 and AG2, as well as the ancient genomes from Sintashta, Andronovo, Afanasievo, Yamnaya, Corded Ware etc. Among the modern populations, the highest percentage of ANE has been detected in the Kalash population. Almost the same as the ANE component in Lazaridis et al. 2014.

Arctic - modal component with peak populations Koryak, Chukchi, Eskimos and Itelmens

ASI - the modal component of South Indian populations (I assume that this component is identical to ASI in Reich et al. 2009).

Caucasus-Gedrosia - identical to Pontikos's Caucasus-Gedrosia cluster

East Asian - the modal component of East Asia

ENF - the component of the ancient European Neolithic Farmers with the peak in the ancient samples of LBK culture (Lazaridis et al. 2014, Haak et al. 2015). Among the modern populations, the highest values have been detected in Sardinians, Corsicans and Basques.

Near East - the modal component of Middle Easterners

Oceanian - the modal component of the aboriginal inhabitants of Oceania, Austronesia, Melanesia and Micronesia (the peak is in modern Papuans and Australian Aborigines)

Paleo-African - the modal component of African Pygmies and Bushmen

Siberian - the modal component of southeastern Siberia

Sub saharan - the second African component (Mandinka, Yoruba and Esan)

WHG-UHG - the native component of the ancient European Mesolithic hunter-gatherers (Lazaridis et al. 2014, Haak et al. 2015). Among the modern populations - the highest percentage in the population of Estonians, Lithuanians, Finns and others.