TY - GEN
T1 - Spatial Audio Empowered Smart speakers with Xblock - A Pose-Adaptive Crosstalk Cancellation Algorithm for Free-moving Users
AU - Liu, Frank
AU - Narsipur, Anish
AU - Kemeklis, Andrew
AU - Song, Lucy
AU - Likamwa, Robert
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/5/9
Y1 - 2023/5/9
AB - Smart IoT Speakers, while connected over a network, currently only produce sounds that come directly from the individual devices. We envision a future where smart speakers collaboratively produce a fabric of spatial audio, capable of perceptually placing sound in a range of locations in physical space. This could provide audio cues in homes, offices and public spaces that are flexibly linked to various positions. The perception of spatialized audio relies on binaural cues, especially the time difference and the level difference of incident sound at a user's left and right ears. Traditional stereo speakers cannot create the spatialization perception for a user when playing binaural audio due to auditory crosstalk, as each ear hears a combination of both speaker outputs. We present Xblock, a novel time-domain pose-adaptive crosstalk cancellation technique that creates a spatial audio perception over a pair of speakers using knowledge of the user's head pose and speaker positions. We build a prototype smart speaker IoT system empowered by Xblock, explore the effectiveness of Xblock through signal analysis, and discuss future perceptual user studies and future work.
KW - algorithm
KW - crosstalk cancellation
KW - internet of things
KW - spatial audio
UR - http://www.scopus.com/inward/record.url?scp=85159780259&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85159780259&partnerID=8YFLogxK
U2 - 10.1145/3576914.3589563
DO - 10.1145/3576914.3589563
M3 - Conference contribution
AN - SCOPUS:85159780259
T3 - ACM International Conference Proceeding Series
SP - 285
EP - 291
BT - Proceedings of 2023 Cyber-Physical Systems and Internet-of-Things Week, CPS-IoT Week 2023 - Workshops
PB - Association for Computing Machinery
T2 - 2023 Cyber-Physical Systems and Internet-of-Things Week, CPS-IoT Week 2023
Y2 - 9 May 2023 through 12 May 2023
ER -