Refining literature curated protein interactions using expert opinions

Oznur Tastan, Y. Yanjun, Jaime G. Carbonell, Judith Klein-Seetharaman

Research output: Contribution to journal › Conference article › peer-review

4 Scopus citations


The availability of high-quality physical interaction datasets is a prerequisite for system-level analysis of interactomes and for supervised models that predict protein-protein interactions (PPIs). One source is literature-curated PPI databases, in which pairwise associations of proteins published in the scientific literature are deposited. However, PPIs may not be clearly labeled as physical interactions, which affects the quality of the entire dataset. To obtain a high-quality gold-standard dataset for PPIs between human immunodeficiency virus (HIV-1) and its human host, we adopted a crowd-sourcing approach. We collected expert opinions and used an expectation-maximization-based approach to estimate expert labeling quality. These estimates are used to infer the probability that a reported PPI is actually a direct physical interaction, given the set of expert opinions. The effectiveness of our approach is demonstrated through synthetic data experiments, and a high-quality physical interaction network between HIV and human proteins is obtained. Since many literature-curated databases suffer from similar challenges, the framework described herein could be used to refine other databases. The curated data is available at
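
The abstract does not spell out the model, so the following is only an illustrative sketch of how an expectation-maximization scheme of this general kind can work. It is a generic binary, Dawid-Skene-style formulation and not necessarily the authors' method. The function name em_expert_quality, the per-expert sensitivity/specificity parameterization, the smoothing constant, and the toy vote matrix are all assumptions made for illustration.

```python
import math


def em_expert_quality(votes, n_iter=50, smooth=1.0):
    """Generic binary EM over expert votes (illustrative, hypothetical helper).

    votes[i][j] is 1 (physical interaction), 0 (not), or None when
    expert j gave no opinion on reported interaction i.
    Returns (posteriors, sensitivities, specificities, prior).
    """
    n_items, n_experts = len(votes), len(votes[0])

    # Initialise each posterior with the fraction of positive votes.
    post = []
    for row in votes:
        seen = [v for v in row if v is not None]
        post.append(sum(seen) / len(seen) if seen else 0.5)

    for _ in range(n_iter):
        # M-step: re-estimate the class prior and each expert's quality,
        # with additive smoothing so no probability is exactly 0 or 1.
        prior = (sum(post) + smooth) / (n_items + 2 * smooth)
        sens, spec = [], []
        for j in range(n_experts):
            tp = fn = tn = fp = 0.0
            for i, row in enumerate(votes):
                v = row[j]
                if v is None:
                    continue
                if v == 1:
                    tp += post[i]
                    fp += 1.0 - post[i]
                else:
                    fn += post[i]
                    tn += 1.0 - post[i]
            sens.append((tp + smooth) / (tp + fn + 2 * smooth))
            spec.append((tn + smooth) / (tn + fp + 2 * smooth))

        # E-step: posterior probability that each item is a true positive,
        # computed in log space for numerical stability.
        for i, row in enumerate(votes):
            log_pos, log_neg = math.log(prior), math.log(1.0 - prior)
            for j, v in enumerate(row):
                if v is None:
                    continue
                if v == 1:
                    log_pos += math.log(sens[j])
                    log_neg += math.log(1.0 - spec[j])
                else:
                    log_pos += math.log(1.0 - sens[j])
                    log_neg += math.log(spec[j])
            m = max(log_pos, log_neg)
            p, q = math.exp(log_pos - m), math.exp(log_neg - m)
            post[i] = p / (p + q)

    return post, sens, spec, prior


# Toy data (made up): rows are reported interactions, columns are experts.
votes = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, None],
    [0, 0, 1],
    [0, 0, 0],
    [0, None, 0],
    [1, 1, 1],
    [0, 0, 1],
]
post, sens, spec, prior = em_expert_quality(votes)
```

In this sketch, post[i] plays the role of the inferred probability that reported interaction i is a direct physical interaction, while sens and spec summarize each expert's estimated labeling quality. Missing opinions are simply skipped, so experts need not label every interaction.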

Original language: English (US)
Pages (from-to): 318-329
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
State: Published - 2015
Externally published: Yes
Event: 20th Pacific Symposium on Biocomputing, PSB 2015 - Big Island, United States
Duration: Jan 4 2015 - Jan 8 2015


  • Crowd-Sourcing
  • Literature Curated Databases
  • Protein-protein Interactions

ASJC Scopus subject areas

  • Biomedical Engineering
  • Computational Theory and Mathematics

