Evaluating evidence for invariant items: A Bayes factor applied to testing measurement invariance in IRT models

Josine Verhagen, Roy Levy, Roger E. Millsap, Jean Paul Fox

Research output: Contribution to journal › Article › peer-review

17 Scopus citations


When comparing test or questionnaire scores between groups, an important assumption is that the test or questionnaire items are measurement invariant: that they measure the underlying construct in the same way in each group. The main goal of tests for measurement invariance is to establish whether support exists for the null hypothesis of invariance. Bayesian hypothesis testing enables researchers to investigate this null hypothesis, with evidence in favor of invariance quantified by the Bayes factor. Verhagen and Fox (2013a) developed a Bayes factor for investigating measurement invariance assumptions of test items for randomly selected groups. For specific groups or measurement occasions, a different Bayes factor test is proposed here, which directly evaluates item parameter differences between groups. In a simulation study, this test is compared to recently developed frequentist measurement invariance tests based on the Wald test. The close comparison with the performance of the Wald test validates the proposed Bayes factor and shows the advantages of the additional information the Bayes factor provides. Both tests are applied to the investigation of measurement invariance of a geometry test (CBASE) to illustrate the use of the Bayes factor test for measurement invariance.
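To make the idea of quantifying evidence for a null hypothesis of "no item parameter difference" concrete, the following is a minimal sketch of one common construction, the Savage-Dickey density ratio. It is an illustration only and is not claimed to be the test developed in the article. It assumes a single item parameter difference d between two groups, a Normal(0, prior_sd^2) prior on d under the alternative, and a normal approximation to the posterior of d. The function names and all numeric values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def savage_dickey_bf01(post_mean, post_sd, prior_sd):
    """Bayes factor in favor of H0: d = 0 over H1: d ~ Normal(0, prior_sd^2).

    Assumption (not from the article): the posterior of the difference d
    is approximated by Normal(post_mean, post_sd^2). The Savage-Dickey
    ratio is the posterior density at d = 0 divided by the prior density
    at d = 0.
    """
    return normal_pdf(0.0, post_mean, post_sd) / normal_pdf(0.0, 0.0, prior_sd)


# Hypothetical posterior summaries for two items (illustrative values only).
bf_invariant = savage_dickey_bf01(post_mean=0.02, post_sd=0.10, prior_sd=1.0)
bf_shifted = savage_dickey_bf01(post_mean=0.60, post_sd=0.10, prior_sd=1.0)

print("BF01, difference near zero:", bf_invariant)
print("BF01, difference far from zero:", bf_shifted)
```

A value above 1 favors invariance and a value below 1 favors a difference. Unlike a p-value from a Wald test, the ratio can express support for the null hypothesis, which is the kind of additional information the abstract refers to.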

Original language: English (US)
Pages (from-to): 171-182
Number of pages: 12
Journal: Journal of Mathematical Psychology
State: Published - Jun 1 2016


  • Bayes factor
  • Bayesian hypothesis testing
  • Bayesian modeling
  • IRT
  • Latent variable models
  • Measurement invariance

ASJC Scopus subject areas

  • General Psychology
  • Applied Mathematics

