Sunday, April 9, 2017

Why We Find Discrepancies in African and Non-African Admixture/Structure Studies

We may never know the admixture between Native Americans and Africans if we wait to get the information from researchers because they are attempting to maintain the status quo.

Discrepancies arise because researchers do not want to tell the truth about the genetic histories of African people and their admixture with Native Americans and Eurasians. As a result, researchers have developed methods to exclude evidence of non-Africans carrying mtDNA haplogroup L and Y-chromosome haplogroups E and A.


This is due to the protocols of the ADMIXTURE and STRUCTURE programs, which assume that Native Americans, Europeans and Africans only met after 1492. As a result, researchers look for methods to exclude the African presence in Europeans and Native Americans so that this admixture does not appear in the final results. Next, researchers claim that if African people carry mtDNA haplogroups N, R, M and D, and Y-chromosome haplogroups C, Q, I, J and R, they are carrying Eurasian haplogroups, even though all of these haplogroups are found among African populations that have no history of admixture with Europeans. As a result, these haplogroups are probably of African origin, not the result of a back migration.

Researchers believe this evidence should be excluded because any African admixture among these populations has to be recent.
The best example of how African admixture is excluded in research is Reich, D. et al., Reconstructing Native American population history. Nature 488, 370-374 (2012). The method used to exclude African admixture from this study is detailed in Supplementary Material 1. Reich, D. et al. (2012) outline the motivations for the exclusion of Africans from their study:

quote:
  • (i) Motivation
    There were a number of populations for which we did not have access to unadmixed samples. To learn about the history of such populations, we needed to adjust for the presence of non-Native ancestry. We used three complementary approaches to do this. The concordance of results from all these approaches increases our confidence in the key findings of this study.

    (1) Restricting to unadmixed samples: We restricted some analyses to 163 Native American samples (34 populations) without any evidence of recent European or African admixture (Note S2). A limitation of these studies, however, is that we could not analyze 16 populations in which all individuals were inferred to have some degree of recent admixture.

    (2) Local ancestry masking: We identified segments of the genome in each individual that had an appreciable probability of harboring non-Native American or Siberian ancestry. We then created a “masked” dataset that treated genetic data in these sections as missing (Note S4).

    (3) Ancestry Subtraction: We explicitly corrected for the effect of the estimated proportion of European and African in each sample by adjusting the value of f4-statistics by the amount that is expected from this admixture. This is discussed in what follows.

    (ii) Details of Ancestry Subtraction
    Assume that we have an accurate estimate of African and European ancestry for each sample (whether it is an individual or a pool of individuals). In practice, we used the ADMIXTURE k=4 estimates, because as described below, they appear to be accurate for Native American populations (with the possible exception of Aleuts as we discuss below). We can then define:

    a = % African ancestry in a test sample
    e = % European ancestry in a test sample
    1-a-e = % Native ancestry

    For many of our analyses, we are computing f4 statistics, whose values are affected in a known way by European and African admixture. Thus, we can algebraically correct for the effect of recent European or African admixture on the test statistics, obtaining an “Ancestry Subtracted” statistic that is what is expected for the sample if it had no recent European or African ancestry.

    The main context in which we compute f4 statistics is in our implementation of the 4 Population Test, to evaluate whether the allele frequency correlation patterns in the data are consistent with the proposed tree ((Unadmixed, Test),(Outgroup1, Outgroup2)), where the Unadmixed population is a set of Native American samples assumed to derive all of their ancestry from the initial population that peopled America, the Test population is another Native American population, and the two outgroups are Asian populations. An f4 statistic consistent with zero suggests that the Unadmixed and Test populations form a clade with no evidence of ancestry from more recent streams of gene flow from Asia. If the Test population harbors recent European or African ancestry, however, a significant deviation of this statistic from zero would be expected, making it difficult to interpret the results. We thus compute a linear combination of f4 statistics that is expected to equal what we would obtain if we had access to the Native American ancestors of the Test population without recent European or African admixture:

    S_1=(f_4 (Unadmixed,Test;Out1,Out2)-(a) f_4 (Unadmixed,Yoruba;Out1,Out2)-(e) f_4 (Unadmixed,French;Out1,Out2))/(1-a-e) (S3.1)

    Intuitively, this statistic is subtracting the contribution to the f4 statistic that is expected from their proportion a of West African-like ancestry (Yoruba), and their proportion e of West Eurasian-like ancestry (French). We then renormalize by 1/(1-a-e) to obtain the statistic that would be expected if the sample was unadmixed.

    A potential concern is that the African and European ancestry in any real Native American test sample is not likely to be from Yoruba and French exactly; instead, it will be from related populations. However, S1 is still expected to have the value we wish to compute if we choose the outgroups to be East Asians or Siberians. The reason is that genetic differences between Yoruba and the true African ancestors, and French and the true European ancestors, are not expected to be correlated to the frequency differences between two East Asian or Siberian outgroups. Specifically, the allele frequency differences are due to history within Africa or Europe, which is not expected to be correlated to allele frequency differences within East Asia and within Siberia.

    (iii) Ancestry Subtraction gives results concordant with those on unadmixed samples
    To compare the performance of our three approaches to address the confounder of recent European and African admixture, we computed 48 = 8×6 statistics of the form f4(Unadmixed, Test; Han, San). We choose “Unadmixed” to be one of 8 Native American groups from Meso-America southward that have sample sizes of at least two and for which all samples are inferred to be unadmixed by ADMIXTURE k=4 (Chane, Embera, Guahibo, Guaymi, Karitiana, Kogi, Surui and Waunana). We choose “Test” to be one of 6 Native American populations from Meso-America southward with at least two samples that are entirely unadmixed, and that also have at least two samples that have >5% non-Native admixture according to the ADMIXTURE k=4 analysis (Aymara, Cabecar, Pima, Tepehuano, Wayuu and Zapotec1). This allows us to compare results on admixed and unadmixed samples from the same population.

    If the Test population harbors European or West African admixture that we have not corrected, we expect to see a significant deviation of the statistic from zero. For example, f4(Karitiana, French; Han, San), corresponding to the statistic expected for an entirely European-admixed Native American population, is significant at Z = 45 standard errors from zero, and f4(Karitiana, Yoruba; Han, San), which gives the f4-value we would expect for an entirely West African-admixed Native American population, is significant at Z = 101.

    Figure S3.1 shows the scatterplots of Z-scores we obtain without Ancestry Subtraction, with Ancestry Subtraction, and with local ancestry masking (Note S4). The x-axis shows data for the unadmixed samples from each Test population, while the y-axis shows the results for the >5% admixed samples from the same populations. We find that:
    • Without Ancestry Subtraction there are significant deviations from zero (|Z|>3) (Fig. S3.1A)
    • With Ancestry Subtraction, there are no residual |Z|-scores >3 (Figure S3.1B)
    • With local ancestry masking (Note S4), there are again no residual |Z|-scores >3 (Figure S3.1C), showing that this method also appears to be appropriately correcting for the admixture.
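
To make the Ancestry Subtraction step quoted above concrete, here is a minimal Python sketch of equation (S3.1). It is only an illustration, not the code used by Reich et al.: the function names and the allele frequencies are made up for this example, f4 is taken as the plain average of (pA - pB)(pC - pD) over SNPs, and no standard errors or Z-scores are computed.

```python
from typing import Sequence


def f4(p_a: Sequence[float], p_b: Sequence[float],
       p_c: Sequence[float], p_d: Sequence[float]) -> float:
    """Plain f4(A, B; C, D): mean over SNPs of (pA - pB) * (pC - pD)."""
    n = len(p_a)
    if n == 0 or not (len(p_b) == len(p_c) == len(p_d) == n):
        raise ValueError("all populations need the same, non-empty SNP set")
    return sum((a - b) * (c - d)
               for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


def ancestry_subtracted_f4(f4_test: float, f4_yoruba: float,
                           f4_french: float, a: float, e: float) -> float:
    """Equation (S3.1): remove the African (a) and European (e) share
    of the statistic, then rescale by the Native share 1 - a - e."""
    native = 1.0 - a - e
    if native <= 0.0:
        raise ValueError("no Native ancestry left after subtraction")
    return (f4_test - a * f4_yoruba - e * f4_french) / native


# Made-up allele frequencies at three SNPs (hypothetical, illustration only).
unadmixed = [0.1, 0.5, 0.9]
native_src = [0.2, 0.4, 0.8]   # Native ancestors of the Test population
yoruba = [0.7, 0.1, 0.3]
french = [0.5, 0.6, 0.2]
out1 = [0.3, 0.3, 0.6]
out2 = [0.2, 0.5, 0.4]

a, e = 0.1, 0.2  # assumed African and European ancestry proportions

# Build a Test population as a linear mixture of its three sources.
test = [(1 - a - e) * n + a * y + e * f
        for n, y, f in zip(native_src, yoruba, french)]

s1 = ancestry_subtracted_f4(
    f4(unadmixed, test, out1, out2),
    f4(unadmixed, yoruba, out1, out2),
    f4(unadmixed, french, out1, out2),
    a, e,
)
print(s1, f4(unadmixed, native_src, out1, out2))
```

In this toy setup the two printed numbers agree up to rounding, because f4 is linear in the Test frequencies. In other words, the correction is built to reproduce the statistic for the Native component alone, and the African and European components set by the ADMIXTURE estimates a and e are subtracted out before any conclusion is drawn.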


Because Africans are excluded from studies like Reich, D. et al. (2012), we do not really know the actual admixture among Africans and Native Americans who carry the accepted African haplogroups, i.e., haplogroups E, L, etc.
