Joint modeling of reaction times and choice improves parameter identifiability in reinforcement learning models

Ian C. Ballard; Samuel McClure

doi:10.1016/j.jneumeth.2019.01.006

Psychology

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Background: Reinforcement learning models provide excellent descriptions of learning in multiple species across a variety of tasks. Many researchers are interested in relating parameters of reinforcement learning models to neural measures, psychological variables, or experimental manipulations. We demonstrate that parameter identification is difficult because a range of parameter values provides approximately equal-quality fits to data. This identification problem has a large impact on power: we show that a researcher who wants to detect a medium-sized correlation (r = 0.3) with 80% power between a variable and learning rate must collect 60% more subjects than specified by a typical power analysis in order to account for the noise introduced by model fitting.

New method: We derive a Bayesian optimal model fitting technique that takes advantage of information contained in choices and reaction times to constrain parameter estimates.

Results: We show using simulation and empirical data that this method substantially improves the ability to recover learning rates.

Comparison with existing methods: We compare this method against the use of Bayesian priors. We show in simulations that the combined use of Bayesian priors and reaction times confers the highest parameter identifiability. However, in real data where the priors may have been misspecified, the use of Bayesian priors interferes with the ability of reaction time data to improve parameter identifiability.

Conclusions: We present a simple technique that takes advantage of readily available data to substantially improve the quality of inferences that can be drawn from parameters of reinforcement learning models.
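To make the power claim in the Background concrete, the following is a minimal sketch of the arithmetic, not the authors' analysis. It assumes that the "typical power analysis" is the standard Fisher z approximation for a two-sided test of a Pearson correlation at alpha = 0.05 (the abstract does not say which procedure was used), and it simply applies the reported 60% inflation to that baseline. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Sample size to detect correlation r (two-sided test).

    Uses the Fisher z approximation: n = ((z_alpha/2 + z_power) / atanh(r))^2 + 3.
    Assumption: this stands in for the 'typical power analysis' in the abstract.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = ((z_alpha + z_power) / math.atanh(r)) ** 2 + 3
    return math.ceil(n)


def inflate(n: int, percent_more: int) -> int:
    """Increase n by a whole-number percentage, rounding up (integer arithmetic)."""
    return (n * (100 + percent_more) + 99) // 100


n_standard = n_for_correlation(0.3)    # baseline for r = 0.3 at 80% power
n_inflated = inflate(n_standard, 60)   # the 60% increase reported in the abstract

print(n_standard, n_inflated)
```

Under these assumptions the baseline is 85 subjects and the inflated requirement is 136, which gives a sense of the cost of noisy learning-rate estimates.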

Original language	English (US)
Pages (from-to)	37-44
Number of pages	8
Journal	Journal of Neuroscience Methods
Volume	317
DOIs	https://doi.org/10.1016/j.jneumeth.2019.01.006
State	Published - Apr 1 2019

Keywords

Delay discounting
Intertemporal choice
Parameter estimation
Power
Q-learning
Reproducibility
Striatum

ASJC Scopus subject areas

General Neuroscience

Cite this

@article{0abfe565bbef42dfae1fc0670e09deb7,

title = "Joint modeling of reaction times and choice improves parameter identifiability in reinforcement learning models",

abstract = "Background: Reinforcement learning models provide excellent descriptions of learning in multiple species across a variety of tasks. Many researchers are interested in relating parameters of reinforcement learning models to neural measures, psychological variables or experimental manipulations. We demonstrate that parameter identification is difficult because a range of parameter values provide approximately equal quality fits to data. This identification problem has a large impact on power: we show that a researcher who wants to detect a medium sized correlation (r =.3) with 80% power between a variable and learning rate must collect 60% more subjects than specified by a typical power analysis in order to account for the noise introduced by model fitting. New method: We derive a Bayesian optimal model fitting technique that takes advantage of information contained in choices and reaction times to constrain parameter estimates. Results: We show using simulation and empirical data that this method substantially improves the ability to recover learning rates. Comparison with existing methods: We compare this method against the use of Bayesian priors. We show in simulations that the combined use of Bayesian priors and reaction times confers the highest parameter identifiability. However, in real data where the priors may have been misspecified, the use of Bayesian priors interferes with the ability of reaction time data to improve parameter identifiability. Conclusions: We present a simple technique that takes advantage of readily available data to substantially improve the quality of inferences that can be drawn from parameters of reinforcement learning models.",

keywords = "Delay discounting, Intertemporal choice, Parameter estimation, Power, Q-learning, Reproducibility, Striatum",

author = "Ballard, {Ian C.} and McClure, Samuel",

note = "Funding Information: The authors would like to thank Elliott Wimmer for providing data, Yuan Chang Leong for feedback and the NSF GRFP and NSF IGERT NSF grants 0801700 and 1634179 for providing training support for I.B. Publisher Copyright: {\textcopyright} 2019 Elsevier B.V.",

year = "2019",

month = apr,

day = "1",

doi = "10.1016/j.jneumeth.2019.01.006",

language = "English (US)",

volume = "317",

pages = "37--44",

journal = "Journal of Neuroscience Methods",

issn = "0165-0270",

publisher = "Elsevier",

}

TY - JOUR

T1 - Joint modeling of reaction times and choice improves parameter identifiability in reinforcement learning models

AU - Ballard, Ian C.

AU - McClure, Samuel

N1 - Funding Information: The authors would like to thank Elliott Wimmer for providing data, Yuan Chang Leong for feedback and the NSF GRFP and NSF IGERT NSF grants 0801700 and 1634179 for providing training support for I.B. Publisher Copyright: © 2019 Elsevier B.V.

PY - 2019/4/1

Y1 - 2019/4/1

AB - Background: Reinforcement learning models provide excellent descriptions of learning in multiple species across a variety of tasks. Many researchers are interested in relating parameters of reinforcement learning models to neural measures, psychological variables or experimental manipulations. We demonstrate that parameter identification is difficult because a range of parameter values provide approximately equal quality fits to data. This identification problem has a large impact on power: we show that a researcher who wants to detect a medium sized correlation (r =.3) with 80% power between a variable and learning rate must collect 60% more subjects than specified by a typical power analysis in order to account for the noise introduced by model fitting. New method: We derive a Bayesian optimal model fitting technique that takes advantage of information contained in choices and reaction times to constrain parameter estimates. Results: We show using simulation and empirical data that this method substantially improves the ability to recover learning rates. Comparison with existing methods: We compare this method against the use of Bayesian priors. We show in simulations that the combined use of Bayesian priors and reaction times confers the highest parameter identifiability. However, in real data where the priors may have been misspecified, the use of Bayesian priors interferes with the ability of reaction time data to improve parameter identifiability. Conclusions: We present a simple technique that takes advantage of readily available data to substantially improve the quality of inferences that can be drawn from parameters of reinforcement learning models.

KW - Delay discounting

KW - Intertemporal choice

KW - Parameter estimation

KW - Power

KW - Q-learning

KW - Reproducibility

KW - Striatum

UR - http://www.scopus.com/inward/record.url?scp=85061319908&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061319908&partnerID=8YFLogxK

U2 - 10.1016/j.jneumeth.2019.01.006

DO - 10.1016/j.jneumeth.2019.01.006

M3 - Article

C2 - 30664916

AN - SCOPUS:85061319908

SN - 0165-0270

VL - 317

SP - 37

EP - 44

JO - Journal of Neuroscience Methods

JF - Journal of Neuroscience Methods

ER -
