TY - GEN
T1 - Discriminative approach to dynamic variance adaptation for noisy speech recognition
AU - Delcroix, Marc
AU - Watanabe, Shinji
AU - Nakatani, Tomohiro
AU - Nakamura, Atsushi
PY - 2011
Y1 - 2011
N2 - The performance of automatic speech recognition degrades severely in the presence of noise or reverberation. One conventional approach to handling such acoustic distortions is to apply a speech enhancement technique prior to recognition. However, most speech enhancement techniques introduce artifacts that create a mismatch between the enhanced speech features and the acoustic model used for recognition, thereby limiting the improvement in recognition performance. Recently, there has been increased interest in methods that compensate for this mismatch by accounting for the feature variance during decoding. In this paper, we propose estimating the feature variance using an adaptation technique based on a discriminative criterion. In an experiment using the Aurora2 database, the proposed method achieved a significant reduction in digit error rate compared with a spectral subtraction pre-processor, and using a discriminative criterion for adaptation provided further improvement over maximum likelihood estimation.
KW - MMI
KW - Model Adaptation
KW - Noise Reduction
KW - Robust ASR
KW - Variance Compensation
UR - http://www.scopus.com/inward/record.url?scp=79961165363&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79961165363&partnerID=8YFLogxK
U2 - 10.1109/HSCMA.2011.5942414
DO - 10.1109/HSCMA.2011.5942414
M3 - Conference contribution
AN - SCOPUS:79961165363
SN - 9781457709999
T3 - 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11
SP - 7
EP - 12
BT - 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11
T2 - 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11
Y2 - 30 May 2011 through 1 June 2011
ER -