TY - JOUR
T1 - Techniques for transferring host-pathogen protein interactions knowledge to new tasks
AU - Kshirsagar, Meghana
AU - Schleker, Sylvia
AU - Carbonell, Jaime
AU - Klein-Seetharaman, Judith
N1 - Publisher Copyright: © 2015 Kshirsagar, Schleker, Carbonell and Klein-Seetharaman.
PY - 2015
Y1 - 2015
AB - We consider the problem of building a model to predict protein-protein interactions (PPIs) between the bacterial species Salmonella Typhimurium and the plant host Arabidopsis thaliana, a host-pathogen pair for which no known PPIs are available. To achieve this, we present approaches that use homology as well as statistical learning methods called "transfer learning." In the transfer learning setting, the task of predicting PPIs between Arabidopsis and its pathogen S. Typhimurium is called the "target task." The presented approaches utilize labeled data, i.e., known PPIs of other host-pathogen pairs (we call these PPIs the "source tasks"). The homology-based approaches use heuristics based on biological intuition to predict PPIs. The transfer learning methods use the similarity of the PPIs from the source tasks to the target task to build a model. For a quantitative evaluation, we consider Salmonella-mouse PPI prediction and some other host-pathogen tasks where known PPIs exist. We use metrics such as precision and recall, and our results show that our methods perform well on the target task in various transfer settings. We present a brief qualitative analysis of the predicted Arabidopsis-Salmonella interactions. We filter the predictions from all approaches using Gene Ontology term enrichment and retain only those interactions involving Salmonella effectors. We thereby observe that Arabidopsis proteins involved in, e.g., transcriptional regulation, hormone-mediated signaling, and defense response may be affected by Salmonella.
KW - Host pathogen protein interactions
KW - Kernel mean matching
KW - Machine learning methods
KW - Plant pathogen protein interactions
KW - Protein interaction prediction
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=84927564776&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84927564776&partnerID=8YFLogxK
U2 - 10.3389/fmicb.2015.00036
DO - 10.3389/fmicb.2015.00036
M3 - Article
AN - SCOPUS:84927564776
SN - 1664-302X
VL - 6
JO - Frontiers in Microbiology
JF - Frontiers in Microbiology
IS - FEB
M1 - 36
ER -