TY - JOUR
T1 - Study of the performance of automatic speech recognition systems in speakers with Parkinson's Disease
AU - Moro-Velazquez, Laureano
AU - Cho, Jaejin
AU - Watanabe, Shinji
AU - Hasegawa-Johnson, Mark A.
AU - Scharenborg, Odette
AU - Kim, Heejin
AU - Dehak, Najim
N1 - Funding Information:
The authors thank Juan I. Godino-Llorente and Jorge A. Gomez-Garcia from Universidad Politecnica de Madrid for sharing their invaluable Neurovoz corpus.
Publisher Copyright:
Copyright © 2019 ISCA
PY - 2019
Y1 - 2019
N2 - Parkinson's Disease (PD) affects the motor capabilities of patients, who in some cases need human-computer assistive technologies to regain independence. The objective of this work is to study in detail the differences in error patterns from state-of-the-art Automatic Speech Recognition (ASR) systems on speech from people with and without PD. Two different speech recognizers (an attention-based end-to-end system and a Deep Neural Network-Hidden Markov Model hybrid system) were trained on a Spanish language corpus and subsequently tested on speech from 43 speakers with PD and 46 without PD. The differences in error rates and in substitutions, insertions, and deletions of characters and phonetic units between the two groups were analyzed, showing that the word error rate is 27% higher in speakers with PD than in control speakers, with a moderate correlation between that rate and the stage of the disease. The errors were related to all manner classes and were more pronounced in the vowel /u/. This study is the first to evaluate ASR systems' responses to speech from patients at different stages of PD in Spanish. The analyses showed general trends, but individual speech deficits must be studied in the future when designing new ASR systems for this population.
AB - Parkinson's Disease (PD) affects the motor capabilities of patients, who in some cases need human-computer assistive technologies to regain independence. The objective of this work is to study in detail the differences in error patterns from state-of-the-art Automatic Speech Recognition (ASR) systems on speech from people with and without PD. Two different speech recognizers (an attention-based end-to-end system and a Deep Neural Network-Hidden Markov Model hybrid system) were trained on a Spanish language corpus and subsequently tested on speech from 43 speakers with PD and 46 without PD. The differences in error rates and in substitutions, insertions, and deletions of characters and phonetic units between the two groups were analyzed, showing that the word error rate is 27% higher in speakers with PD than in control speakers, with a moderate correlation between that rate and the stage of the disease. The errors were related to all manner classes and were more pronounced in the vowel /u/. This study is the first to evaluate ASR systems' responses to speech from patients at different stages of PD in Spanish. The analyses showed general trends, but individual speech deficits must be studied in the future when designing new ASR systems for this population.
KW - Automatic speech recognition
KW - Deep neural networks
KW - Dysarthria
KW - Parkinson's disease
KW - Word error rate
UR - http://www.scopus.com/inward/record.url?scp=85074718312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074718312&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2019-2993
DO - 10.21437/Interspeech.2019-2993
M3 - Conference article
AN - SCOPUS:85074718312
SN - 2308-457X
VL - 2019-September
SP - 3875
EP - 3879
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019
Y2 - 15 September 2019 through 19 September 2019
ER -