TY - JOUR
T1 - K-shuff
T2 - A novel algorithm for characterizing structural and compositional diversity in gene libraries
AU - Jangid, Kamlesh
AU - Kao, Ming-Hung
AU - Lahamge, Aishwarya
AU - Williams, Mark A.
AU - Rathbun, Stephen L.
AU - Whitman, William B.
N1 - Funding Information:
This work was supported by the Franklin College of the University of Georgia and grants from the USDA and NSF. We thank Mohit Navandar and Rajesh Jangid for helping with the Perl script. KJ acknowledges the funding support of the Department of Biotechnology, Government of India, under the project entitled "Establishment of Microbial Culture Collection" (Grant no. BT/PR/0054/NDB/52/94/2007).
Publisher Copyright:
© 2016 Jangid et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2016/12
Y1 - 2016/12
N2 - K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing.
AB - K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing.
UR - http://www.scopus.com/inward/record.url?scp=85000926657&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85000926657&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0167634
DO - 10.1371/journal.pone.0167634
M3 - Article
C2 - 27911946
AN - SCOPUS:85000926657
SN - 1932-6203
VL - 11
JO - PLOS ONE
JF - PLOS ONE
IS - 12
M1 - e0167634
ER -