Modeling Clustered Data with Very Few Clusters

Daniel McNeish, Laura M. Stapleton

Research output: Contribution to journal › Article › peer-review

253 Scopus citations


Small-sample inference with clustered data has received increased attention recently in the methodological literature, with several simulation studies being presented on the small-sample behavior of many methods. However, nearly all previous studies focus on a single class of methods (e.g., only multilevel models, only corrections to sandwich estimators), and the differential performance of various methods that can be implemented to accommodate clustered data with very few clusters is largely unknown, potentially due to the rigid disciplinary preferences. Furthermore, a majority of these studies focus on scenarios with 15 or more clusters and feature unrealistically simple data-generation models with very few predictors. This article, motivated by an applied educational psychology cluster randomized trial, presents a simulation study that simultaneously addresses the extreme small sample and differential performance (estimation bias, Type I error rates, and relative power) of 12 methods to account for clustered data with a model that features a more realistic number of predictors. The motivating data are then modeled with each method, and results are compared. Results show that generalized estimating equations perform poorly; the choice of Bayesian prior distributions affects performance; and fixed effect models perform quite well. Limitations and implications for applications are also discussed.

Original language: English (US)
Pages (from-to): 495-518
Number of pages: 24
Journal: Multivariate Behavioral Research
Issue number: 4
State: Published - Jul 3 2016
Externally published: Yes


Keywords

  • Bayesian
  • GEE
  • HLM
  • cluster randomized trial
  • fixed effect model
  • multilevel model
  • small sample

ASJC Scopus subject areas

  • Statistics and Probability
  • Experimental and Cognitive Psychology
  • Arts and Humanities (miscellaneous)

