Decorrelating Language Model Embeddings for Speech-Based Prediction of Cognitive Impairment

Lingfeng Xu; Kimberly D. Mueller; Julie Liss; Visar Berisha

doi:10.1109/ICASSP49357.2023.10097265

College of Health Solutions (CHS)

Research output: Contribution to journal › Conference article › peer-review

Abstract

Training robust clinical speech-based models that generalize requires large sample sizes because speech is variable and high-dimensional. Researchers have turned to foundation models, such as the Bidirectional Encoder Representations from Transformers (BERT), to generate lower-dimensional embeddings, and then fine-tune the models for a specific downstream clinical task. While there is empirical evidence that this approach is helpful, a recent study reveals that the embeddings generated by BERT models tend to be highly correlated, which makes the downstream models difficult to fine-tune, particularly in the small-sample-size regime. In this work, we propose a new regularization scheme that penalizes correlated embeddings during fine-tuning of BERT, and we apply the approach to speech-based assessment of cognitive impairment. Compared to existing methods, the proposed method yields lower estimation errors and lower false alarm rates in a Mini-Mental State Examination (MMSE) score regression task.
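
The abstract does not give the form of the regularizer, so the following is only an illustrative sketch of one common way to penalize correlated embedding dimensions, not the authors' formulation. It assumes the penalty is the mean squared off-diagonal entry of the Pearson correlation matrix computed over a batch of embeddings, added to the task loss with a weight. The names `correlation_penalty`, `regularized_loss`, `weight`, and `eps` are hypothetical.

```python
import math


def correlation_penalty(embeddings, eps=1e-8):
    """Mean squared off-diagonal entry of the feature correlation matrix.

    embeddings: list of n rows (samples), each a list of d floats (features).
    Returns about 0.0 when all feature pairs are uncorrelated and about 1.0
    when every pair is perfectly correlated or anti-correlated.
    (Assumed form for illustration; not taken from the paper.)
    """
    n = len(embeddings)
    d = len(embeddings[0])
    if n < 2 or d < 2:
        return 0.0

    # Center each feature (column) across the batch.
    means = [sum(row[j] for row in embeddings) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in embeddings]

    # Euclidean norm of each centered column; eps avoids division by zero.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in centered)) + eps
        for j in range(d)
    ]

    # Accumulate squared correlations over all distinct feature pairs.
    total = 0.0
    for i in range(d):
        for j in range(i + 1, d):
            cov = sum(row[i] * row[j] for row in centered)
            corr = cov / (norms[i] * norms[j])
            total += corr ** 2

    num_pairs = d * (d - 1) / 2
    return total / num_pairs


def regularized_loss(task_loss, embeddings, weight=0.1):
    """Task loss (e.g. MMSE regression error) plus the weighted penalty."""
    return task_loss + weight * correlation_penalty(embeddings)


if __name__ == "__main__":
    correlated = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    uncorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
    print(correlation_penalty(correlated))
    print(correlation_penalty(uncorrelated))
```

In an actual fine-tuning loop the same quantity would be computed on the model's embedding tensor with a differentiable framework so that gradients flow back into BERT; the pure-Python version here only shows the arithmetic.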

Original language	English (US)
Journal	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
DOIs	https://doi.org/10.1109/ICASSP49357.2023.10097265
State	Published - 2023
Event	48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration	Jun 4 2023 → Jun 10 2023

Keywords

Language modeling
clinical speech analytics
decorrelated features

ASJC Scopus subject areas

Software
Signal Processing
Electrical and Electronic Engineering

Access to Document

10.1109/ICASSP49357.2023.10097265

Cite this

@article{a5da14cc7ea44774b178a772dae1b914,

title = "Decorrelating Language Model Embeddings for Speech-Based Prediction of Cognitive Impairment",

abstract = "Training robust clinical speech-based models that generalize requires large sample sizes because speech is variable and high-dimensional. Researchers have turned to foundational models, such as the Bidirectional Encoder Representations from Transformers (BERT), to generate lower-dimensional embeddings, and then finetuned the models for a specific down-stream clinical task. While there is empirical evidence that this approach is helpful, a recent study reveals that the embeddings generated by BERT models tend to be highly correlated, which makes the downstream models difficult to fine-tune, particularly in the small sample size regime. In this work, we propose a new regularization scheme to penalize correlated embeddings during fine tuning of BERT and apply the approach to speech-based assessment of cognitive impairment. Compared to existing methods, the proposed method yields lower estimation errors and smaller false alarm rates in a Mini-Mental State Examination (MMSE) score regression task.",

keywords = "Language modeling, clinical speech analytics, decorrelated features",

author = "Lingfeng Xu and Mueller, {Kimberly D.} and Julie Liss and Visar Berisha",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 ; Conference date: 04-06-2023 Through 10-06-2023",

year = "2023",

doi = "10.1109/ICASSP49357.2023.10097265",

language = "English (US)",

journal = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

issn = "1520-6149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Decorrelating Language Model Embeddings for Speech-Based Prediction of Cognitive Impairment

AU - Xu, Lingfeng

AU - Mueller, Kimberly D.

AU - Liss, Julie

AU - Berisha, Visar

PY - 2023

Y1 - 2023

N2 - Training robust clinical speech-based models that generalize requires large sample sizes because speech is variable and high-dimensional. Researchers have turned to foundational models, such as the Bidirectional Encoder Representations from Transformers (BERT), to generate lower-dimensional embeddings, and then finetuned the models for a specific down-stream clinical task. While there is empirical evidence that this approach is helpful, a recent study reveals that the embeddings generated by BERT models tend to be highly correlated, which makes the downstream models difficult to fine-tune, particularly in the small sample size regime. In this work, we propose a new regularization scheme to penalize correlated embeddings during fine tuning of BERT and apply the approach to speech-based assessment of cognitive impairment. Compared to existing methods, the proposed method yields lower estimation errors and smaller false alarm rates in a Mini-Mental State Examination (MMSE) score regression task.

KW - Language modeling

KW - clinical speech analytics

KW - decorrelated features

UR - http://www.scopus.com/inward/record.url?scp=85180403225&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85180403225&partnerID=8YFLogxK

U2 - 10.1109/ICASSP49357.2023.10097265

DO - 10.1109/ICASSP49357.2023.10097265

M3 - Conference article

AN - SCOPUS:85180403225

SN - 1520-6149

JO - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

JF - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

Y2 - 4 June 2023 through 10 June 2023

ER -
