Link Prediction for Partially Observed Networks

Yunpeng Zhao; Yun Jhong Wu; Elizaveta Levina; Ji Zhu

doi:10.1080/10618600.2017.1286243

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Link prediction is one of the fundamental problems in network analysis. In many applications, notably in genetics, a partially observed network may not contain any negative examples, that is, edges known for certain to be absent, which creates a difficulty for existing supervised learning approaches. We develop a new method that treats the observed network as a sample of the true network with different sampling rates for positive (true edges) and negative (absent edges) examples. We obtain a relative ranking of potential links by their probabilities, using information on network topology as well as node covariates if available. The method relies on the intuitive assumption that if two pairs of nodes are similar, the probabilities of these pairs forming an edge are also similar. Empirically, the method performs well under many settings, including when the observed network is sparse. We apply the method to a protein–protein interaction network and a school friendship network.
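The abstract's claim that a relative ranking can be recovered even with different sampling rates for positive and negative examples can be illustrated with a small sketch. The sampling model below is an assumption made for illustration and is not spelled out in the abstract: a true edge is recorded with rate `alpha`, an absent edge is recorded as present with rate `beta`, and `alpha > beta`. The function names, the example pairs, and the numeric values are all hypothetical.

```python
def observed_edge_prob(p, alpha, beta):
    """Probability that a node pair shows up as an edge in the observed network.

    p     -- true edge probability for the pair (hypothetical value)
    alpha -- assumed rate at which true edges are recorded
    beta  -- assumed rate at which absent edges are recorded as present
    """
    return alpha * p + beta * (1.0 - p)


def rank_pairs(scores):
    """Return node pairs ordered from most to least likely to be linked."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical true edge probabilities for four node pairs.
true_probs = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): 0.6, ("c", "d"): 0.05}

# Assumed sampling rates: most true edges are missed, few false edges appear.
alpha, beta = 0.3, 0.01

observed = {pair: observed_edge_prob(p, alpha, beta) for pair, p in true_probs.items()}

# The observed probability equals (alpha - beta) * p + beta, which is increasing
# in p when alpha > beta, so both orderings agree.
assert rank_pairs(observed) == rank_pairs(true_probs)
```

Under this assumed model the observed probabilities are compressed toward zero, but their order matches the order of the true probabilities, which is why a ranking of candidate links may still be meaningful when `alpha` and `beta` are unknown.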

Original language	English (US)
Pages (from-to)	725-733
Number of pages	9
Journal	Journal of Computational and Graphical Statistics
Volume	26
Issue number	3
DOIs	https://doi.org/10.1080/10618600.2017.1286243
State	Published - Jul 3 2017
Externally published	Yes

Keywords

Link prediction
Ranking
Social networks

ASJC Scopus subject areas

Discrete Mathematics and Combinatorics
Statistics and Probability
Statistics, Probability and Uncertainty

Cite this

@article{d6419c3cf38f454099291dbf13a0ac73,
title = "Link Prediction for Partially Observed Networks",
abstract = "Link prediction is one of the fundamental problems in network analysis. In many applications, notably in genetics, a partially observed network may not contain any negative examples, that is, edges known for certain to be absent, which creates a difficulty for existing supervised learning approaches. We develop a new method that treats the observed network as a sample of the true network with different sampling rates for positive (true edges) and negative (absent edges) examples. We obtain a relative ranking of potential links by their probabilities, using information on network topology as well as node covariates if available. The method relies on the intuitive assumption that if two pairs of nodes are similar, the probabilities of these pairs forming an edge are also similar. Empirically, the method performs well under many settings, including when the observed network is sparse. We apply the method to a protein–protein interaction network and a school friendship network.",
keywords = "Link prediction, Ranking, Social networks",
author = "Yunpeng Zhao and Wu, {Yun Jhong} and Elizaveta Levina and Ji Zhu",
note = "Publisher Copyright: {\textcopyright} 2017 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.",
year = "2017",
month = jul,
day = "3",
doi = "10.1080/10618600.2017.1286243",
language = "English (US)",
volume = "26",
pages = "725--733",
journal = "Journal of Computational and Graphical Statistics",
issn = "1061-8600",
publisher = "American Statistical Association",
number = "3",
}

TY  - JOUR
T1  - Link Prediction for Partially Observed Networks
AU  - Zhao, Yunpeng
AU  - Wu, Yun Jhong
AU  - Levina, Elizaveta
AU  - Zhu, Ji
PY  - 2017/7/3
Y1  - 2017/7/3
N2  - Link prediction is one of the fundamental problems in network analysis. In many applications, notably in genetics, a partially observed network may not contain any negative examples, that is, edges known for certain to be absent, which creates a difficulty for existing supervised learning approaches. We develop a new method that treats the observed network as a sample of the true network with different sampling rates for positive (true edges) and negative (absent edges) examples. We obtain a relative ranking of potential links by their probabilities, using information on network topology as well as node covariates if available. The method relies on the intuitive assumption that if two pairs of nodes are similar, the probabilities of these pairs forming an edge are also similar. Empirically, the method performs well under many settings, including when the observed network is sparse. We apply the method to a protein–protein interaction network and a school friendship network.
AB  - Link prediction is one of the fundamental problems in network analysis. In many applications, notably in genetics, a partially observed network may not contain any negative examples, that is, edges known for certain to be absent, which creates a difficulty for existing supervised learning approaches. We develop a new method that treats the observed network as a sample of the true network with different sampling rates for positive (true edges) and negative (absent edges) examples. We obtain a relative ranking of potential links by their probabilities, using information on network topology as well as node covariates if available. The method relies on the intuitive assumption that if two pairs of nodes are similar, the probabilities of these pairs forming an edge are also similar. Empirically, the method performs well under many settings, including when the observed network is sparse. We apply the method to a protein–protein interaction network and a school friendship network.
KW  - Link prediction
KW  - Ranking
KW  - Social networks
UR  - http://www.scopus.com/inward/record.url?scp=85021940941&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85021940941&partnerID=8YFLogxK
U2  - 10.1080/10618600.2017.1286243
DO  - 10.1080/10618600.2017.1286243
M3  - Article
AN  - SCOPUS:85021940941
SN  - 1061-8600
VL  - 26
SP  - 725
EP  - 733
JO  - Journal of Computational and Graphical Statistics
JF  - Journal of Computational and Graphical Statistics
IS  - 3
ER  -
