TY - JOUR
T1 - A discriminative acoustic-prosodic approach for measuring local entrainment
AU - Willi, Megan M.
AU - Borrie, Stephanie A.
AU - Barrett, Tyson S.
AU - Tu, Ming
AU - Berisha, Visar
N1 - Funding Information:
This research was supported by the National Institute of Deafness and Other Communication Disorders, National Institutes of Health Grants R21DC016084-01 and R01DC006859.
Publisher Copyright:
© 2018 International Speech Communication Association. All rights reserved.
PY - 2018
Y1 - 2018
N2 - Acoustic-prosodic entrainment describes the tendency of humans to align or adapt their speech acoustics to each other in conversation. This alignment of spoken behavior has important implications for conversational success. However, modeling the subtle nature of entrainment in spoken dialogue continues to pose a challenge. In this paper, we propose a straightforward definition for local entrainment in the speech domain and operationalize an algorithm based on this: acoustic-prosodic features that capture entrainment should be maximally different between real conversations involving two partners and sham conversations generated by randomly mixing the speaking turns from the original two conversational partners. We propose an approach for measuring local entrainment that quantifies alignment of behavior on a turn-by-turn basis, projecting the differences between interlocutors' acoustic-prosodic features for a given turn onto a discriminative feature subspace that maximizes the difference between real and sham conversations. We evaluate the method using the derived features to drive a classifier aiming to predict an objective measure of conversational success (i.e., low versus high), on a corpus of task-oriented conversations. The proposed entrainment approach achieves 72% classification accuracy using a Naive Bayes classifier, outperforming three previously established approaches evaluated on the same conversational corpus.
AB - Acoustic-prosodic entrainment describes the tendency of humans to align or adapt their speech acoustics to each other in conversation. This alignment of spoken behavior has important implications for conversational success. However, modeling the subtle nature of entrainment in spoken dialogue continues to pose a challenge. In this paper, we propose a straightforward definition for local entrainment in the speech domain and operationalize an algorithm based on this: acoustic-prosodic features that capture entrainment should be maximally different between real conversations involving two partners and sham conversations generated by randomly mixing the speaking turns from the original two conversational partners. We propose an approach for measuring local entrainment that quantifies alignment of behavior on a turn-by-turn basis, projecting the differences between interlocutors' acoustic-prosodic features for a given turn onto a discriminative feature subspace that maximizes the difference between real and sham conversations. We evaluate the method using the derived features to drive a classifier aiming to predict an objective measure of conversational success (i.e., low versus high), on a corpus of task-oriented conversations. The proposed entrainment approach achieves 72% classification accuracy using a Naive Bayes classifier, outperforming three previously established approaches evaluated on the same conversational corpus.
KW - Conversational Success
KW - Entrainment
KW - Linear Discriminant Analysis
KW - Spoken Dialogue Systems
UR - http://www.scopus.com/inward/record.url?scp=85054993468&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054993468&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2018-1419
DO - 10.21437/Interspeech.2018-1419
M3 - Conference article
AN - SCOPUS:85054993468
SN - 2308-457X
VL - 2018-September
SP - 581
EP - 585
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018
Y2 - 2 September 2018 through 6 September 2018
ER -