K-shuff: A novel algorithm for characterizing structural and compositional diversity in gene libraries

Kamlesh Jangid, Ming-Hung Kao, Aishwarya Lahamge, Mark A. Williams, Stephen L. Rathbun, William B. Whitman

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


K-shuff is a new algorithm for comparing gene sequence libraries. It provides measures of their structural and compositional diversity and tests the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function (IKF) measures the structural diversity within a library, including both the richness and the overall similarity of the sequences. The Cross K-function (CKF) measures the compositional diversity between gene libraries, reflecting both the number of operational taxonomic units (OTUs) shared and the overall similarity of the OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and the compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries of more than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This sensitivity enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs, which is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as for specific hypothesis testing.
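The general idea of an intra K-function, a cross K-function, and a Monte Carlo test can be illustrated with a minimal Python sketch. This is not the published K-shuff implementation. The distance measure (fraction of mismatched positions in pre-aligned sequences), the test statistic (summed squared gap between the intra and cross curves), and all function names are assumptions chosen only for illustration.

```python
import random
from itertools import combinations

def mismatch_fraction(a, b):
    """Assumed distance: fraction of differing positions in two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(seqs):
    """Symmetric matrix of pairwise distances between all sequences."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = mismatch_fraction(seqs[i], seqs[j])
    return dist

def intra_k(dist, idx, radius):
    """Fraction of distinct pairs inside one library with distance <= radius."""
    pairs = list(combinations(idx, 2))
    return sum(dist[i][j] <= radius for i, j in pairs) / len(pairs)

def cross_k(dist, idx_a, idx_b, radius):
    """Fraction of between-library pairs with distance <= radius."""
    hits = sum(dist[i][j] <= radius for i in idx_a for j in idx_b)
    return hits / (len(idx_a) * len(idx_b))

def statistic(dist, idx_a, idx_b, radii):
    """Illustrative statistic (an assumption): squared gap between the
    intra curve of library A and the cross curve, summed over radii."""
    return sum(
        (intra_k(dist, idx_a, r) - cross_k(dist, idx_a, idx_b, r)) ** 2
        for r in radii
    )

def monte_carlo_p(dist, idx_a, idx_b, radii, n_perm=199, seed=0):
    """Monte Carlo p-value obtained by shuffling library membership."""
    rng = random.Random(seed)
    observed = statistic(dist, idx_a, idx_b, radii)
    pooled = list(idx_a) + list(idx_b)
    n_a = len(idx_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(dist, pooled[:n_a], pooled[n_a:], radii) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Toy data (hypothetical): two small libraries of aligned 4-base sequences.
seqs = ["AAAA", "AAAA", "AAAT", "TTTT", "TTTT", "TTTA"]
lib_a, lib_b = [0, 1, 2], [3, 4, 5]
dist = distance_matrix(seqs)
radii = [0.0, 0.25, 0.5]
p_value = monte_carlo_p(dist, lib_a, lib_b, radii)
```

In this toy case every pair within library A lies within a distance of 0.25, while no pair spanning the two libraries lies within 0.5, so the intra and cross curves separate clearly. Real libraries would need a proper alignment and an evolutionary distance measure, along with far more sequences than shown here.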

Original language: English (US)
Article number: e0167634
Journal: PLoS ONE
Issue number: 12
State: Published - Dec 2016

ASJC Scopus subject areas

  • General Biochemistry, Genetics and Molecular Biology
  • General Agricultural and Biological Sciences
  • General

