Deep Learning Based Prediction of Hypernasality for Clinical Applications

Vikram C. Mathad, Kathy Chapman, Julie Liss, Nancy Scherer, Visar Berisha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Hypernasality refers to the perception of excessive nasal resonance during the production of oral sounds. Existing methods for automatic assessment of hypernasality from speech are based on machine learning models trained on disordered speech databases rated by speech-language pathologists. However, the performance of such systems critically depends on the availability of hypernasal speech samples and the reliability of clinical ratings. In this paper, we propose a new approach that uses speech samples from healthy controls to model the acoustic characteristics of nasalized speech. Using healthy speech samples, we develop a 4-class deep neural network classifier for the classification of nasal consonants, oral consonants, nasalized vowels, and oral vowels. We use the classifier to compute nasalization scores for clinical speech samples and show that the resulting scores correlate with clinical perception of hypernasality. The proposed approach is evaluated on speech samples from speakers with dysarthria and speakers with cleft lip and palate.
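The abstract's scoring idea — frame-level posteriors from the 4-class classifier (nasal consonant, oral consonant, nasalized vowel, oral vowel) aggregated into an utterance-level nasalization score — can be sketched as follows. This is a minimal illustration, assuming softmax posteriors per frame and a score defined as the mean posterior mass on the two nasal classes; the paper's exact aggregation may differ.

```python
import numpy as np

# Class indices for the 4-class classifier described in the abstract.
# This ordering is an assumption made for illustration only.
NASAL_CONSONANT, ORAL_CONSONANT, NASALIZED_VOWEL, ORAL_VOWEL = 0, 1, 2, 3


def nasalization_score(frame_posteriors: np.ndarray) -> float:
    """Aggregate frame-level class posteriors into an utterance-level score.

    frame_posteriors: array of shape (num_frames, 4); each row sums to 1.
    Returns the mean posterior mass assigned to the nasal classes
    (a hypothetical aggregation; not necessarily the paper's formula).
    """
    nasal_mass = frame_posteriors[:, [NASAL_CONSONANT, NASALIZED_VOWEL]].sum(axis=1)
    return float(nasal_mass.mean())


# Toy example: 3 frames of softmax outputs from the classifier.
posteriors = np.array([
    [0.7, 0.1, 0.1, 0.1],  # frame dominated by a nasal consonant
    [0.1, 0.6, 0.1, 0.2],  # frame dominated by an oral consonant
    [0.2, 0.1, 0.5, 0.2],  # frame dominated by a nasalized vowel
])
score = nasalization_score(posteriors)
```

A score near 1 would indicate that most frames carry nasal posterior mass; scores computed this way on clinical samples could then be correlated against perceptual hypernasality ratings, as the abstract describes.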

Original language: English (US)
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6554-6558
Number of pages: 5
ISBN (Electronic): 9781509066315
DOIs
State: Published - May 2020
Externally published: Yes
Event: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: May 4 2020 – May 8 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country/Territory: Spain
City: Barcelona
Period: 5/4/20 – 5/8/20

Keywords

  • Cleft lip and palate
  • deep neural network
  • dysarthric speech
  • hypernasality
  • velopharyngeal dysfunction

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
