TY - JOUR
T1 - Employing computational linguistics techniques to identify limited patient health literacy
T2 - Findings from the ECLIPPSE study
AU - Schillinger, Dean
AU - Balyan, Renu
AU - Crossley, Scott A.
AU - McNamara, Danielle S.
AU - Liu, Jennifer Y.
AU - Karter, Andrew J.
N1 - Funding Information:
This work was supported by grants NLM R01 LM012355 from the National Institutes of Health, NIDDK Centers for Diabetes Translational Research (P30 DK092924), R01 DK065664, and NICHD R01 HD46113; by the Institute of Education Sciences, US Department of Education, through grant R305A180261; and by Office of Naval Research grant N00014-17-1-2300.
Publisher Copyright:
© 2020 The Authors. Health Services Research published by Wiley Periodicals LLC on behalf of Health Research and Educational Trust
PY - 2021/2
Y1 - 2021/2
N2 - Objective: To develop novel, scalable, and valid literacy profiles for identifying limited health literacy patients by harnessing natural language processing. Data Source: With respect to the linguistic content, we analyzed 283 216 secure messages sent by 6941 diabetes patients to physicians within an integrated system's electronic portal. Sociodemographic, clinical, and utilization data were obtained via questionnaire and electronic health records. Study Design: Retrospective study used natural language processing and machine learning to generate five unique “Literacy Profiles” by employing various sets of linguistic indices: Flesch-Kincaid (LP_FK); basic indices of writing complexity, including lexical diversity (LP_LD) and writing quality (LP_WQ); and advanced indices related to syntactic complexity, lexical sophistication, and diversity, modeled from self-reported (LP_SR) and expert-rated (LP_Exp) health literacy. We first determined the performance of each literacy profile relative to self-reported and expert-rated health literacy to discriminate between high and low health literacy and then assessed Literacy Profiles’ relationships with known correlates of health literacy, such as patient sociodemographics and a range of health-related outcomes, including ratings of physician communication, medication adherence, diabetes control, comorbidities, and utilization. Principal Findings: LP_SR and LP_Exp performed best in discriminating between high and low self-reported (C-statistics: 0.86 and 0.58, respectively) and expert-rated health literacy (C-statistics: 0.71 and 0.87, respectively) and were significantly associated with educational attainment, race/ethnicity, Consumer Assessment of Healthcare Providers and Systems (CAHPS) scores, adherence, glycemia, comorbidities, and emergency department visits.
Conclusions: Since health literacy is a potentially remediable explanatory factor in health care disparities, the development of automated health literacy indicators represents a significant accomplishment with broad clinical and population health applications. Health systems could apply literacy profiles to efficiently determine whether quality of care and outcomes vary by patient health literacy; identify at-risk populations for targeting tailored health communications and self-management support interventions; and inform clinicians to promote improvements in individual-level care.
KW - communication
KW - diabetes
KW - health literacy
KW - machine learning
KW - managed care
KW - natural language processing
KW - secure messaging
UR - http://www.scopus.com/inward/record.url?scp=85091296602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091296602&partnerID=8YFLogxK
U2 - 10.1111/1475-6773.13560
DO - 10.1111/1475-6773.13560
M3 - Article
C2 - 32966630
AN - SCOPUS:85091296602
SN - 0017-9124
VL - 56
SP - 132
EP - 144
JO - Health Services Research
JF - Health Services Research
IS - 1
ER -