Towards integrative gene prioritization in Alzheimer's disease

Jang H. Lee; Graciela H. Gonzalez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many methods have been proposed for facilitating the uncovering of genes that underlie the pathology of different diseases. Some are purely statistical, resulting in a (mostly) undifferentiated set of genes that are differentially expressed (or co-expressed), while others seek to prioritize the resulting set of genes through comparison against specific known targets. Most of the recent approaches use either single data or knowledge sources, or combine the independent predictions from each source. However, given that multiple kinds of heterogeneous sources are potentially relevant for gene prioritization, each subject to different levels of noise and of varying reliability, each source bearing information not carried by another, we claim that an ideal prioritization method should provide ways to discern amongst them in a true integrative fashion that captures the subtleties of each, rather than using a simple combination of sources. Integration of multiple data for gene prioritization is thus more challenging than its single data type counterpart. What we propose is a novel, general, and flexible formulation that enables multi-source data integration for gene prioritization that maximizes the complementary nature of different data and knowledge sources in order to make the most use of the information content of aggregate data. Protein-protein interactions and Gene Ontology annotations were used as knowledge sources, together with assay-specific gene expression and genome-wide association data. Leave-one-out testing was performed using a known set of Alzheimer's Disease genes to validate our proposed method. We show that our proposed method performs better than the best multi-source gene prioritization systems currently published.
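The leave-one-out testing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the scoring rule below (counting direct interaction partners among the remaining known genes) is a deliberately simple stand-in, and the gene labels, the network, and the function names `neighbor_score` and `leave_one_out_ranks` are all hypothetical. It shows only the general validation loop: hold out one known gene, score the candidate pool using the remaining known genes, and record where the held-out gene ranks.

```python
from typing import Dict, List, Set


def neighbor_score(gene: str, seeds: Set[str], network: Dict[str, Set[str]]) -> int:
    """Toy score (an assumption, not the paper's method):
    the number of seed genes directly linked to `gene`."""
    return len(network.get(gene, set()) & seeds)


def leave_one_out_ranks(known: List[str], candidates: List[str],
                        network: Dict[str, Set[str]]) -> Dict[str, int]:
    """Hold out each known gene in turn and return its rank (1 = best)
    among the candidates when scored against the remaining known genes."""
    ranks = {}
    for held_out in known:
        seeds = set(known) - {held_out}
        # The pool is the candidate list plus the held-out gene.
        pool = [g for g in candidates if g not in seeds]
        if held_out not in pool:
            pool.append(held_out)
        # Sort by descending score; ties are broken alphabetically
        # so that the result is deterministic.
        ordered = sorted(pool, key=lambda g: (-neighbor_score(g, seeds, network), g))
        ranks[held_out] = ordered.index(held_out) + 1
    return ranks


# Hypothetical toy network with placeholder labels, not real interaction data.
toy_network = {
    "A": {"B", "X"},
    "B": {"A"},
    "C": {"X"},
    "X": {"A", "C"},
    "Y": set(),
}
ranks = leave_one_out_ranks(["A", "B", "C"], ["X", "Y"], toy_network)
print(ranks)
```

A held-out gene that consistently ranks near the top suggests the scoring captures what the known genes have in common. An integrative method of the kind the abstract describes would replace `neighbor_score` with a score drawing on several data and knowledge sources.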

Original language	English (US)
Title of host publication	Pacific Symposium on Biocomputing 2011, PSB 2011
Pages	4-13
Number of pages	10
State	Published - 2011
Event	16th Pacific Symposium on Biocomputing, PSB 2011 - Kohala Coast, HI, United States
Duration	Jan 3 2011 → Jan 7 2011

Publication series

Name	Pacific Symposium on Biocomputing 2011, PSB 2011

Other

Other	16th Pacific Symposium on Biocomputing, PSB 2011
Country/Territory	United States
City	Kohala Coast, HI
Period	1/3/11 → 1/7/11

ASJC Scopus subject areas

Computational Theory and Mathematics
Biomedical Engineering
General Medicine

Cite this

@inproceedings{a095e95f8cf7472fb6a350a30978e2ae,

title = "Towards integrative gene prioritization in Alzheimer's disease",

abstract = "Many methods have been proposed for facilitating the uncovering of genes that underlie the pathology of different diseases. Some are purely statistical, resulting in a (mostly) undifferentiated set of genes that are differentially expressed (or co-expressed), while others seek to prioritize the resulting set of genes through comparison against specific known targets. Most of the recent approaches use either single data or knowledge sources, or combine the independent predictions from each source. However, given that multiple kinds of heterogeneous sources are potentially relevant for gene prioritization, each subject to different levels of noise and of varying reliability, each source bearing information not carried by another, we claim that an ideal prioritization method should provide ways to discern amongst them in a true integrative fashion that captures the subtleties of each, rather than using a simple combination of sources. Integration of multiple data for gene prioritization is thus more challenging than its single data type counterpart. What we propose is a novel, general, and flexible formulation that enables multi-source data integration for gene prioritization that maximizes the complementary nature of different data and knowledge sources in order to make the most use of the information content of aggregate data. Protein-protein interactions and Gene Ontology annotations were used as knowledge sources, together with assay-specific gene expression and genome-wide association data. Leave-one-out testing was performed using a known set of Alzheimer's Disease genes to validate our proposed method. We show that our proposed method performs better than the best multi-source gene prioritization systems currently published.",

author = "Lee, {Jang H.} and Gonzalez, {Graciela H.}",

year = "2011",

language = "English (US)",

isbn = "9814335053",

series = "Pacific Symposium on Biocomputing 2011, PSB 2011",

pages = "4--13",

booktitle = "Pacific Symposium on Biocomputing 2011, PSB 2011",

}

TY - GEN

T1 - Towards integrative gene prioritization in Alzheimer's disease

AU - Lee, Jang H.

AU - Gonzalez, Graciela H.

PY - 2011

Y1 - 2011

N2 - Many methods have been proposed for facilitating the uncovering of genes that underlie the pathology of different diseases. Some are purely statistical, resulting in a (mostly) undifferentiated set of genes that are differentially expressed (or co-expressed), while others seek to prioritize the resulting set of genes through comparison against specific known targets. Most of the recent approaches use either single data or knowledge sources, or combine the independent predictions from each source. However, given that multiple kinds of heterogeneous sources are potentially relevant for gene prioritization, each subject to different levels of noise and of varying reliability, each source bearing information not carried by another, we claim that an ideal prioritization method should provide ways to discern amongst them in a true integrative fashion that captures the subtleties of each, rather than using a simple combination of sources. Integration of multiple data for gene prioritization is thus more challenging than its single data type counterpart. What we propose is a novel, general, and flexible formulation that enables multi-source data integration for gene prioritization that maximizes the complementary nature of different data and knowledge sources in order to make the most use of the information content of aggregate data. Protein-protein interactions and Gene Ontology annotations were used as knowledge sources, together with assay-specific gene expression and genome-wide association data. Leave-one-out testing was performed using a known set of Alzheimer's Disease genes to validate our proposed method. We show that our proposed method performs better than the best multi-source gene prioritization systems currently published.

UR - http://www.scopus.com/inward/record.url?scp=84863142782&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863142782&partnerID=8YFLogxK

M3 - Conference contribution

C2 - 21121028

AN - SCOPUS:84863142782

SN - 9814335053

SN - 9789814335058

T3 - Pacific Symposium on Biocomputing 2011, PSB 2011

SP - 4

EP - 13

BT - Pacific Symposium on Biocomputing 2011, PSB 2011

T2 - 16th Pacific Symposium on Biocomputing, PSB 2011

Y2 - 3 January 2011 through 7 January 2011

ER -
