The effects of random taxa sampling schemes in Bayesian virus phylogeography

Daniel Magee, Matthew Scotch

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Public health researchers are often tasked with accurately and quickly identifying the location and time when an epidemic originated from a representative sample of nucleotide sequences. In this paper, we investigate multiple approaches to subsampling the sequence set when employing a Bayesian phylogeographic generalized linear model. Our results indicate that near-categorical posterior MCC estimates on the root can be obtained with replicate runs using 25–50% of the sequence data, and that including 90% of sequences does not necessarily entail more accurate inferences. We present the first analysis of predictor signal suppression and show how the ability to detect the influence of predictor variables is limited when sample size predictors are included in the models.
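The replicate subsampling idea in the abstract can be illustrated with a minimal sketch. It assumes uniform random sampling of taxa without replacement, which is only one possible scheme; the abstract does not specify the paper's exact sampling procedures, and the function name `subsample_taxa`, its parameters, and the sequence labels are hypothetical.

```python
import random


def subsample_taxa(taxa, fraction, n_replicates, seed=0):
    """Draw replicate random subsamples of taxon labels (illustrative only).

    Assumption: each replicate is a uniform draw without replacement of
    `fraction` of the taxa; this is not taken from the paper itself.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so replicates are reproducible
    k = max(1, round(fraction * len(taxa)))  # number of taxa per replicate
    return [sorted(rng.sample(taxa, k)) for _ in range(n_replicates)]


# Hypothetical labels standing in for a real sequence set.
taxa = [f"seq{i:03d}" for i in range(200)]
replicates = subsample_taxa(taxa, fraction=0.25, n_replicates=3)
```

Each replicate list could then be used to filter an alignment before a separate phylogeographic run, so that root estimates can be compared across replicates.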

Original language: English (US)
Pages (from-to): 225-230
Number of pages: 6
Journal: Infection, Genetics and Evolution
State: Published - Oct 2018


  • Phylogeography
  • Selection Bias
  • Viruses

ASJC Scopus subject areas

  • Microbiology
  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
  • Microbiology (medical)
  • Infectious Diseases

