Genetic differentiation between and within Northern Native American language groups: an argument for the expansion of the Native American CODIS database

Jessica A. Weise, Jillian Ng, Robert F. Oldt, Joy Viray, Kelly L. McCulloh, David Glenn Smith, Sreetharan Kanthaswamy

Research output: Contribution to journal, Article, peer-review


The National Research Council recommends that genetic differentiation among subgroups of an ethnic sample be lower than 3% of the total genetic differentiation within that sample if the sample is to be used for estimating reliable random match probabilities for forensic use. Native American samples in the United States’ Combined DNA Index System (CODIS) database represent four language families: Algonquian, Na-Dene, Eskimo-Aleut, and Salishan. However, at least 27 Native American language families exist in the US, not including language isolates. Our goal was to ascertain whether genetic differences are correlated with language groupings and, if so, whether including additional language families would provide a more accurate representation of current genetic diversity among tribal populations. The 21 short tandem repeat (STR) loci included in the Globalfiler® PCR Amplification Kit were used to characterize six indigenous language families, including three of the four represented in the CODIS database (i.e., Algonquian, Na-Dene, and Eskimo-Aleut), and two language isolates (Miwok and Seri) using major population genetic diversity metrics such as F statistics and Bayesian clustering analysis of genotype frequencies. Most of the genetic variation (97%) was found within language families rather than among them (3%). In contrast, when only the three language families represented in both the CODIS database and the present study were considered, 4% of the genetic variation occurred among the language groups. Bayesian clustering yielded a maximum posterior probability for three genetically distinct groups among the eight language families and isolates: (1) Eskimo, (2) Seri, and (3) all other language groups and isolates, thus confirming genetic subdivision among subgroups of the CODIS Native American database.
This genetic structure indicates the need for an increased number of Native American populations based on language affiliation in the CODIS database as well as more robust sample sets for those language families. Supplemental data for this article is available online at
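
The among-group share of variation discussed in the abstract can be illustrated with a single-locus fixation index. The sketch below is only an illustration under stated assumptions, not the authors' analysis pipeline: it uses the textbook heterozygosity-based estimator F_ST = (H_T - H_S) / H_T, and the function names and allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def fst_single_locus(group_freqs, weights=None):
    """Heterozygosity-based F_ST for one locus.

    group_freqs: one allele-frequency list per group (hypothetical input),
                 all lists covering the same alleles in the same order.
    weights:     relative group weights; equal weights are assumed if omitted.
    """
    k = len(group_freqs)
    if weights is None:
        weights = [1.0 / k] * k
    n_alleles = len(group_freqs[0])

    # H_S: weighted mean of the within-group expected heterozygosities.
    h_s = sum(w * expected_heterozygosity(f)
              for w, f in zip(weights, group_freqs))

    # H_T: expected heterozygosity of the pooled (weighted mean) frequencies.
    pooled = [sum(w * f[a] for w, f in zip(weights, group_freqs))
              for a in range(n_alleles)]
    h_t = expected_heterozygosity(pooled)

    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall, so no differentiation
    return (h_t - h_s) / h_t


# Hypothetical two-allele locus in two groups (made-up frequencies).
toy_fst = fst_single_locus([[0.6, 0.4], [0.4, 0.6]])
exceeds_threshold = toy_fst > 0.03  # compare with the 3% guideline
```

With these made-up frequencies the estimate is about 0.04, which would exceed the 3% guideline cited above. A real analysis would combine all 21 loci and account for sample sizes, which this sketch does not attempt.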

Original language: English (US)
Pages (from-to): 662-672
Number of pages: 11
Journal: Forensic Sciences Research
Issue number: 4
State: Published - 2022
Externally published: Yes

Keywords

  • Forensic sciences
  • Native Americans
  • North America
  • languages
  • population genetics
  • short tandem repeats (STRs or microsatellites)

ASJC Scopus subject areas

  • Analytical Chemistry
  • Pathology and Forensic Medicine
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Anthropology
  • Physical and Theoretical Chemistry
  • Psychiatry and Mental health

