Monte Carlo samplers for efficient network inference

Zeliha Kilic; Max Schweiger; Camille Moyer; Steve Pressé

doi:10.1371/journal.pcbi.1011256

Physics

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Accessing information on an underlying network driving a biological process often involves interrupting the process and collecting snapshot data. When snapshot data are stochastic, the data’s structure necessitates a probabilistic description to infer underlying reaction networks. As an example, we may imagine wanting to learn gene state networks from the type of data collected in single-molecule RNA fluorescence in situ hybridization (RNA-FISH). In the networks we consider, nodes represent network states, and edges represent biochemical reaction rates linking states. Simultaneously estimating the number of nodes and constituent parameters from snapshot data remains a challenging task, in part on account of data uncertainty and timescale separations between kinetic parameters mediating the network. While parametric Bayesian methods learn parameters given a network structure (with known node numbers) with rigorously propagated measurement uncertainty, learning the number of nodes and parameters with potentially large timescale separations remain open questions. Here, we propose a Bayesian nonparametric framework and describe a hybrid Bayesian Markov Chain Monte Carlo (MCMC) sampler directly addressing these challenges. In particular, in our hybrid method, Hamiltonian Monte Carlo (HMC) leverages local posterior geometries in inference to explore the parameter space; Adaptive Metropolis-Hastings (AMH) learns correlations between plausible parameter sets to efficiently propose probable models; and Parallel Tempering takes into account multiple models simultaneously with tempered information content to augment sampling efficiency. We apply our method to synthetic data mimicking single-molecule RNA-FISH, a popular snapshot method in probing transcriptional networks, to illustrate the identified challenges inherent to learning dynamical models from these snapshots and how our method addresses them.
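To make the parallel tempering idea in the abstract concrete, the following is a minimal sketch on a toy problem. It is not the authors' sampler: it omits the HMC and adaptive Metropolis-Hastings components and uses a plain random-walk Metropolis update inside each tempered chain. The one-dimensional bimodal target, the function names `log_target` and `parallel_tempering`, and the inverse-temperature ladder are all assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Toy unnormalized log-density (an assumption, not from the paper):
    equal mixture of two narrow Gaussians centred at -3 and +3."""
    a = -0.5 * ((x + 3.0) / 0.5) ** 2
    b = -0.5 * ((x - 3.0) / 0.5) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def parallel_tempering(n_steps, betas, step=1.0, seed=0):
    """Run one chain per inverse temperature in `betas` (betas[0] = 1 is the
    untempered chain) and return the samples of the untempered chain."""
    rng = random.Random(seed)
    states = [0.0 for _ in betas]
    cold_samples = []
    for _ in range(n_steps):
        # Within-chain move: random-walk Metropolis on the tempered target.
        for i, beta in enumerate(betas):
            proposal = states[i] + rng.gauss(0.0, step)
            log_accept = beta * (log_target(proposal) - log_target(states[i]))
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < log_accept:
                states[i] = proposal
        # Between-chain move: propose swapping one random adjacent pair.
        i = rng.randrange(len(betas) - 1)
        log_swap = (betas[i] - betas[i + 1]) * (
            log_target(states[i + 1]) - log_target(states[i])
        )
        if math.log(1.0 - rng.random()) < log_swap:
            states[i], states[i + 1] = states[i + 1], states[i]
        cold_samples.append(states[0])
    return cold_samples


if __name__ == "__main__":
    samples = parallel_tempering(20000, [1.0, 0.5, 0.25, 0.1, 0.02], seed=1)
    right = sum(1 for s in samples if s > 0) / len(samples)
    print(f"fraction of untempered samples in the right-hand mode: {right:.2f}")
```

The flattened (small beta) chains cross the low-probability region between the two modes easily, and accepted swaps pass those crossings down to the untempered chain, which on its own would tend to stay in whichever mode it first reaches.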

Original language	English (US)
Article number	e1011256
Journal	PLoS Computational Biology
Volume	19
Issue number	7
DOIs	https://doi.org/10.1371/journal.pcbi.1011256
State	Published - Jul 2023

ASJC Scopus subject areas

Ecology, Evolution, Behavior and Systematics
Modeling and Simulation
Ecology
Molecular Biology
Genetics
Cellular and Molecular Neuroscience
Computational Theory and Mathematics

Cite this

@article{e56a1af789204398a1f0e9066b079b1d,

title = "Monte Carlo samplers for efficient network inference",

abstract = "Accessing information on an underlying network driving a biological process often involves interrupting the process and collecting snapshot data. When snapshot data are stochastic, the data{\textquoteright}s structure necessitates a probabilistic description to infer underlying reaction networks. As an example, we may imagine wanting to learn gene state networks from the type of data collected in single molecule RNA fluorescence in situ hybridization (RNA-FISH). In the networks we consider, nodes represent network states, and edges represent biochemical reaction rates linking states. Simultaneously estimating the number of nodes and constituent parameters from snapshot data remains a challenging task in part on account of data uncertainty and timescale separations between kinetic parameters mediating the network. While parametric Bayesian methods learn parameters given a network structure (with known node numbers) with rigorously propagated measurement uncertainty, learning the number of nodes and parameters with potentially large timescale separations remain open questions. Here, we propose a Bayesian nonparametric framework and describe a hybrid Bayesian Markov Chain Monte Carlo (MCMC) sampler directly addressing these challenges. In particular, in our hybrid method, Hamiltonian Monte Carlo (HMC) leverages local posterior geometries in inference to explore the parameter space; Adaptive Metropolis Hastings (AMH) learns correlations between plausible parameter sets to efficiently propose probable models; and Parallel Tempering takes into account multiple models simultaneously with tempered information content to augment sampling efficiency. We apply our method to synthetic data mimicking single molecule RNA-FISH, a popular snapshot method in probing transcriptional networks to illustrate the identified challenges inherent to learning dynamical models from these snapshots and how our method addresses them.",

author = "Zeliha Kilic and Max Schweiger and Camille Moyer and Steve Press{\'e}",

note = "Funding Information: Funding: S.P. acknowledges support from NIH NIGMS (R01GM130745), NIH NIGMS (R01GM134426), NIH NIGMS MIRA (R35GM148237). All authors received salaries from NIH during the study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Publisher Copyright: {\textcopyright} 2023 Kilic et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.",

year = "2023",

month = jul,

doi = "10.1371/journal.pcbi.1011256",

language = "English (US)",

volume = "19",

journal = "PLoS Computational Biology",

issn = "1553-734X",

publisher = "Public Library of Science",

number = "7",

}

TY - JOUR

T1 - Monte Carlo samplers for efficient network inference

AU - Kilic, Zeliha

AU - Schweiger, Max

AU - Moyer, Camille

AU - Pressé, Steve

N1 - Funding Information: Funding: S.P. acknowledges support from NIH NIGMS (R01GM130745), NIH NIGMS (R01GM134426), NIH NIGMS MIRA (R35GM148237). All authors received salaries from NIH during the study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Publisher Copyright: © 2023 Kilic et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

PY - 2023/7

Y1 - 2023/7

AB - Accessing information on an underlying network driving a biological process often involves interrupting the process and collecting snapshot data. When snapshot data are stochastic, the data’s structure necessitates a probabilistic description to infer underlying reaction networks. As an example, we may imagine wanting to learn gene state networks from the type of data collected in single molecule RNA fluorescence in situ hybridization (RNA-FISH). In the networks we consider, nodes represent network states, and edges represent biochemical reaction rates linking states. Simultaneously estimating the number of nodes and constituent parameters from snapshot data remains a challenging task in part on account of data uncertainty and timescale separations between kinetic parameters mediating the network. While parametric Bayesian methods learn parameters given a network structure (with known node numbers) with rigorously propagated measurement uncertainty, learning the number of nodes and parameters with potentially large timescale separations remain open questions. Here, we propose a Bayesian nonparametric framework and describe a hybrid Bayesian Markov Chain Monte Carlo (MCMC) sampler directly addressing these challenges. In particular, in our hybrid method, Hamiltonian Monte Carlo (HMC) leverages local posterior geometries in inference to explore the parameter space; Adaptive Metropolis Hastings (AMH) learns correlations between plausible parameter sets to efficiently propose probable models; and Parallel Tempering takes into account multiple models simultaneously with tempered information content to augment sampling efficiency. We apply our method to synthetic data mimicking single molecule RNA-FISH, a popular snapshot method in probing transcriptional networks to illustrate the identified challenges inherent to learning dynamical models from these snapshots and how our method addresses them.

UR - http://www.scopus.com/inward/record.url?scp=85165517413&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85165517413&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1011256

DO - 10.1371/journal.pcbi.1011256

M3 - Article

C2 - 37463156

AN - SCOPUS:85165517413

SN - 1553-734X

VL - 19

JO - PLoS Computational Biology

JF - PLoS Computational Biology

IS - 7

M1 - e1011256

ER -
