Investigating the role of L1 in automatic pronunciation evaluation of L2 speech

Ming Tu; Anna Grabek; Julie Liss; Visar Berisha

doi:10.21437/Interspeech.2018-1350

Investigating the role of L1 in automatic pronunciation evaluation of L2 speech

Ming Tu, Anna Grabek, Julie Liss, Visar Berisha

Research output: Contribution to journal › Conference article › peer-review

25 Scopus citations

Abstract

Automatic pronunciation evaluation plays an important role in pronunciation training and second language education. This field draws heavily on concepts from automatic speech recognition (ASR) to quantify how close the pronunciation of nonnative speech is to native-like pronunciation. However, it is known that the formation of accent is related to pronunciation patterns of both the target language (L2) and the speaker's first language (L1). In this paper, we propose to use two native speech acoustic models, one trained on L2 speech and the other trained on L1 speech. We develop two sets of measurements that can be extracted from two acoustic models given accented speech. A new utterance-level feature extraction scheme is used to convert these measurements into a fixed-dimension vector which is used as an input to a statistical model to predict the accentedness of a speaker. On a data set consisting of speakers from 4 different L1 backgrounds, we show that the proposed system yields improved correlation with human evaluators compared to systems only using the L2 acoustic model.

Original language	English (US)
Pages (from-to)	1636-1640
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	2018-September
DOIs	https://doi.org/10.21437/Interspeech.2018-1350
State	Published - 2018
Event	19th Annual Conference of the International Speech Communication, INTERSPEECH 2018 - Hyderabad, India Duration: Sep 2 2018 → Sep 6 2018

Keywords

Accentedness
Automatic speech recognition
Pronunciation evaluation

ASJC Scopus subject areas

Language and Linguistics
Human-Computer Interaction
Signal Processing
Software
Modeling and Simulation

Access to Document

10.21437/Interspeech.2018-1350

Cite this

@article{02dc974d9d564e53a8a0c89e62749617,

title = "Investigating the role of L1 in automatic pronunciation evaluation of L2 speech",

abstract = "Automatic pronunciation evaluation plays an important role in pronunciation training and second language education. This field draws heavily on concepts from automatic speech recognition (ASR) to quantify how close the pronunciation of nonnative speech is to native-like pronunciation. However, it is known that the formation of accent is related to pronunciation patterns of both the target language (L2) and the speaker's first language (L1). In this paper, we propose to use two native speech acoustic models, one trained on L2 speech and the other trained on L1 speech. We develop two sets of measurements that can be extracted from two acoustic models given accented speech. A new utterance-level feature extraction scheme is used to convert these measurements into a fixed-dimension vector which is used as an input to a statistical model to predict the accentedness of a speaker. On a data set consisting of speakers from 4 different L1 backgrounds, we show that the proposed system yields improved correlation with human evaluators compared to systems only using the L2 acoustic model.",

keywords = "Accentedness, Automatic speech recognition, Pronunciation evaluation",

author = "Ming Tu and Anna Grabek and Julie Liss and Visar Berisha",

note = "Publisher Copyright: {\textcopyright} 2018 International Speech Communication Association. All rights reserved.; 19th Annual Conference of the International Speech Communication, INTERSPEECH 2018 ; Conference date: 02-09-2018 Through 06-09-2018",

year = "2018",

doi = "10.21437/Interspeech.2018-1350",

language = "English (US)",

volume = "2018-September",

pages = "1636--1640",

journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

issn = "2308-457X",

}

TY - JOUR

T1 - Investigating the role of L1 in automatic pronunciation evaluation of L2 speech

AU - Tu, Ming

AU - Grabek, Anna

AU - Liss, Julie

AU - Berisha, Visar

PY - 2018

Y1 - 2018

N2 - Automatic pronunciation evaluation plays an important role in pronunciation training and second language education. This field draws heavily on concepts from automatic speech recognition (ASR) to quantify how close the pronunciation of nonnative speech is to native-like pronunciation. However, it is known that the formation of accent is related to pronunciation patterns of both the target language (L2) and the speaker's first language (L1). In this paper, we propose to use two native speech acoustic models, one trained on L2 speech and the other trained on L1 speech. We develop two sets of measurements that can be extracted from two acoustic models given accented speech. A new utterance-level feature extraction scheme is used to convert these measurements into a fixed-dimension vector which is used as an input to a statistical model to predict the accentedness of a speaker. On a data set consisting of speakers from 4 different L1 backgrounds, we show that the proposed system yields improved correlation with human evaluators compared to systems only using the L2 acoustic model.

AB - Automatic pronunciation evaluation plays an important role in pronunciation training and second language education. This field draws heavily on concepts from automatic speech recognition (ASR) to quantify how close the pronunciation of nonnative speech is to native-like pronunciation. However, it is known that the formation of accent is related to pronunciation patterns of both the target language (L2) and the speaker's first language (L1). In this paper, we propose to use two native speech acoustic models, one trained on L2 speech and the other trained on L1 speech. We develop two sets of measurements that can be extracted from two acoustic models given accented speech. A new utterance-level feature extraction scheme is used to convert these measurements into a fixed-dimension vector which is used as an input to a statistical model to predict the accentedness of a speaker. On a data set consisting of speakers from 4 different L1 backgrounds, we show that the proposed system yields improved correlation with human evaluators compared to systems only using the L2 acoustic model.

KW - Accentedness

KW - Automatic speech recognition

KW - Pronunciation evaluation

UR - http://www.scopus.com/inward/record.url?scp=85054974214&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054974214&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2018-1350

DO - 10.21437/Interspeech.2018-1350

M3 - Conference article

AN - SCOPUS:85054974214

SN - 2308-457X

VL - 2018-September

SP - 1636

EP - 1640

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

T2 - 19th Annual Conference of the International Speech Communication, INTERSPEECH 2018

Y2 - 2 September 2018 through 6 September 2018

ER -

Investigating the role of L1 in automatic pronunciation evaluation of L2 speech

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this