TY - JOUR
T1 - Genetic differentiation between and within Northern Native American language groups
T2 - an argument for the expansion of the Native American CODIS database
AU - Weise, Jessica A.
AU - Ng, Jillian
AU - Oldt, Robert F.
AU - Viray, Joy
AU - McCulloh, Kelly L.
AU - Smith, David Glenn
AU - Kanthaswamy, Sreetharan
N1 - Funding Information:
This study was funded by a National Institute of Justice grant [grant number 2014-DN-BX-K024] to Sreetharan Kanthaswamy, and a research grant from the UC Davis Forensic Science Graduate Program to Jessica A. Weise.
Publisher Copyright:
© 2021 The Author(s). Published by Taylor & Francis Group on behalf of the Academy of Forensic Science.
PY - 2022
Y1 - 2022
N2 - The National Research Council recommends that genetic differentiation among subgroups of ethnic samples be lower than 3% of the total genetic differentiation within the ethnic sample to be used for estimating reliable random match probabilities for forensic use. Native American samples in the United States’ Combined DNA Index System (CODIS) database represent four language families: Algonquian, Na-Dene, Eskimo-Aleut, and Salishan. However, a minimum of 27 Native American language families exists in the US, not including language isolates. Our goal was to ascertain whether genetic differences are correlated with language groupings and, if so, whether additional language families would provide a more accurate representation of current genetic diversity among tribal populations. The 21 short tandem repeat (STR) loci included in the Globalfiler® PCR Amplification Kit were used to characterize six indigenous language families, including three of the four represented in the CODIS database (i.e. Algonquian, Na-Dene, and Eskimo-Aleut), and two language isolates (Miwok and Seri) using major population genetic diversity metrics such as F statistics and Bayesian clustering analysis of genotype frequencies. Most of the genetic variation (97%) was found to be within language families instead of among them (3%). In contrast, when only the three of the four language families represented in both the CODIS database and the present study were considered, 4% of the genetic variation occurred among the language groups. Bayesian clustering resulted in a maximum posterior probability indicating three genetically distinct groups among the eight language families and isolates: (1) Eskimo, (2) Seri, and (3) all other language groups and isolates, thus confirming genetic subdivision among subgroups of the CODIS Native American database. 
This genetic structure indicates the need for an increased number of Native American populations based on language affiliation in the CODIS database as well as more robust sample sets for those language families. Supplemental data for this article is available online at https://doi.org/10.1080/20961790.2021.1963088.
KW - Forensic sciences
KW - Native Americans
KW - North America
KW - languages
KW - population genetics
KW - short tandem repeats (STRs or microsatellites)
UR - http://www.scopus.com/inward/record.url?scp=85115223400&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115223400&partnerID=8YFLogxK
U2 - 10.1080/20961790.2021.1963088
DO - 10.1080/20961790.2021.1963088
M3 - Article
AN - SCOPUS:85115223400
SN - 2096-1790
VL - 7
SP - 662
EP - 672
JO - Forensic Sciences Research
JF - Forensic Sciences Research
IS - 4
ER -