XGSA: A statistical method for cross-species gene set analysis

Djordje Djordjevic, Kenro Kusumi, Joshua W K Ho

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

Motivation: Gene set analysis is a powerful tool for determining whether an experimentally derived set of genes is statistically significantly enriched for genes in other pre-defined gene sets, such as known pathways, gene ontology terms, or other experimentally derived gene sets. Current gene set analysis methods do not facilitate comparing gene sets across different organisms, as they do not explicitly deal with homology mapping between species. A systematic investigation of the effect of complex gene homology on cross-species gene set analysis has been lacking.

Results: In this study, we show that not accounting for the complex homology structure when comparing gene sets in two species can lead to false positive discoveries, especially when the gene sets being compared have complex gene homology relationships. To overcome this bias, we propose a straightforward statistical approach, called XGSA, that explicitly takes the cross-species homology mapping into consideration when doing gene set analysis. Simulation experiments confirm that XGSA can avoid false positive discoveries while maintaining good statistical power compared to other ad hoc approaches for cross-species gene set analysis. We further demonstrate the effectiveness of XGSA with two real-life case studies that aim to discover conserved or species-specific molecular pathways involved in social challenge and vertebrate appendage regeneration.

Availability and Implementation: The R source code for XGSA is available under a GNU General Public License at http://github.com/VCCRI/XGSA.
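To make the bias described in the abstract concrete, the following minimal sketch shows a naive cross-species overlap test: species-A genes are mapped to species B through a homology table, and the overlap with a species-B gene set is scored with a one-sided hypergeometric test. This is an illustration of the ad hoc approach the abstract cautions against, not the XGSA method itself. The gene names, the homology table, the universe size of 20, and the helper names `hypergeom_sf` and `naive_cross_species_overlap` are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N population, K successes, n draws)."""
    lower = max(k, 0)
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible terms contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(lower, upper + 1)) / total


def naive_cross_species_overlap(set_a, set_b, homologs):
    """Map species-A genes to species-B homologs, then intersect with set_b."""
    mapped = set()
    for gene in set_a:
        mapped.update(homologs.get(gene, ()))
    return mapped, mapped & set_b


# Hypothetical toy data: gene a1 has three homologs in species B (one-to-many).
homologs = {"a1": ["b1", "b2", "b3"], "a2": ["b4"], "a3": ["b5"]}
set_a = {"a1", "a2"}                # experimentally derived set in species A
set_b = {"b1", "b2", "b3", "b9"}    # pre-defined gene set in species B
universe_size = 20                  # assumed number of species-B genes considered

mapped, overlap = naive_cross_species_overlap(set_a, set_b, homologs)
p_value = hypergeom_sf(len(overlap), universe_size, len(set_b), len(mapped))

print(sorted(overlap), p_value)
```

In this toy case all three overlapping species-B genes trace back to the single species-A gene `a1`, yet the test treats them as three independent hits. That is the kind of inflation a homology-aware method is meant to avoid.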

Original language: English (US)
Pages (from-to): i620-i628
Journal: Bioinformatics
Volume: 32
Issue number: 17
State: Published - Sep 1 2016

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
