Abstract
This paper presents a novel nonlinear regression model for estimating heterogeneous treatment effects, geared specifically towards situations with small effect sizes, heterogeneous effects, and strong confounding by observables. Standard nonlinear regression models, which may work quite well for prediction, have two notable weaknesses when used to estimate heterogeneous treatment effects. First, they can yield badly biased estimates of treatment effects when fit to data with strong confounding. The Bayesian causal forest model presented in this paper avoids this problem by directly incorporating an estimate of the propensity function in the specification of the response model, implicitly inducing a covariate-dependent prior on the regression function. Second, standard approaches to response surface modeling do not provide adequate control over the strength of regularization over effect heterogeneity. The Bayesian causal forest model permits treatment effect heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively "shrink to homogeneity". While we focus on observational data, our methods are equally useful for inferring heterogeneous treatment effects from randomized controlled experiments, where careful regularization is somewhat less complicated but no less important. We illustrate these benefits via the reanalysis of an observational study assessing the causal effects of smoking on medical expenditures as well as extensive simulation studies.
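The two design choices described in the abstract can be sketched as a single response equation. The notation here is an assumption introduced for illustration and does not appear in the abstract: $x_i$ denotes the covariates, $z_i \in \{0, 1\}$ the treatment indicator, $\hat{\pi}(x_i)$ an estimate of the propensity function, $\mu$ the prognostic function, and $\tau$ the treatment effect function.

```latex
\begin{align*}
  y_i &= \mu\bigl(x_i, \hat{\pi}(x_i)\bigr) + \tau(x_i)\, z_i + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
  \hat{\pi}(x_i) &\approx \Pr(Z_i = 1 \mid x_i).
\end{align*}
```

In this sketch, passing $\hat{\pi}(x_i)$ to $\mu$ as an extra input corresponds to incorporating the propensity estimate in the response model, and placing separate priors on $\mu$ and $\tau$ is what would allow $\tau$ to be shrunk towards a constant ("shrink to homogeneity") without also over-regularizing $\mu$.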
Original language | English (US) |
---|---|
Pages (from-to) | 965-1056 |
Number of pages | 92 |
Journal | Bayesian Analysis |
Volume | 15 |
Issue number | 3 |
State | Published - 2020 |
Keywords
- Bayesian
- Causal inference
- Heterogeneous treatment effects
- Machine learning
- Predictor-dependent priors
- Regression trees
- Regularization
- Shrinkage
ASJC Scopus subject areas
- Statistics and Probability
- Applied Mathematics