Classifying Refugee Status Using Common Features in EMR

Malia Morrison, Vanessa Nobles, Crista E. Johnson-Agbakwu, Celeste Bailey, Li Liu

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Automated, accurate identification of refugees in healthcare databases is a critical first step toward investigating the healthcare needs of this vulnerable population and reducing health disparities. In this study, we developed a machine-learning method, named the refugee identification system (RIS), to address this need. We curated a data set of 103 refugees and 930 non-refugees in Arizona. We compiled de-identified individual-level information, including age, primary language, and noise-masked home address, together with state-level refugee resettlement statistics and world language statistics. We then performed feature engineering to convert language and masked address into quantitative features. Finally, we built a random forest model to classify refugees and non-refugees. RIS achieved high classification accuracy (overall accuracy = 0.97, specificity = 0.99, sensitivity = 0.85, positive predictive value = 0.88, negative predictive value = 0.98, and area under the receiver operating characteristic curve = 0.98). RIS is customizable for refugee identification outside Arizona. Its application enables large-scale investigation of refugee healthcare needs and supports the reduction of health disparities.
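
For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, positive predictive value, and negative predictive value are derived from a binary confusion matrix. The abstract does not report a confusion matrix, so the counts used here are hypothetical: they were chosen only so that the class totals match the 103 refugees and 930 non-refugees in the data set, and they should not be read as the study's actual results. The names `ConfusionCounts` and `binary_metrics` are likewise illustrative and not part of RIS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier, with 'refugee' as the positive class."""
    tp: int  # refugees predicted as refugees
    fn: int  # refugees predicted as non-refugees
    tn: int  # non-refugees predicted as non-refugees
    fp: int  # non-refugees predicted as refugees


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard threshold-based classification metrics."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # recall on refugees
        "specificity": c.tn / (c.tn + c.fp),  # recall on non-refugees
        "ppv": c.tp / (c.tp + c.fp),          # precision on refugees
        "npv": c.tn / (c.tn + c.fn),          # precision on non-refugees
    }


# HYPOTHETICAL counts: only the row totals (103 and 930) come from the abstract.
counts = ConfusionCounts(tp=88, fn=15, tn=918, fp=12)
metrics = binary_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because refugees make up roughly 10% of this data set, overall accuracy is dominated by the non-refugee class, which is why sensitivity and positive predictive value are worth reading alongside it. The area under the ROC curve cannot be computed from a single confusion matrix, since it requires the model's predicted scores across all thresholds.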

Original language: English (US)
Article number: e202200651
Journal: Chemistry and Biodiversity
Volume: 19
Issue number: 10
State: Published - Oct 2022

Keywords

  • health disparity
  • health informatics
  • machine learning
  • refugee health

ASJC Scopus subject areas

  • Bioengineering
  • Biochemistry
  • General Chemistry
  • Molecular Medicine
  • Molecular Biology
