Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer

Marc Delcroix*, Tomohiro Nakatani, Shinji Watanabe

*Corresponding author for this work

Research output: Conference contribution

2 Citations (Scopus)

Abstract

It is well known that automatic speech recognition performs poorly in the presence of noise or reverberation. Much research has been undertaken on model adaptation and speech enhancement to increase the robustness of speech recognizers. Model adaptation is effective at removing static mismatch between speech features and acoustic model parameters, but may not cope well with dynamic mismatch. Speech enhancement approaches can reduce dynamic perturbations, but often do not interconnect well with the speech recognizer. There seems to be no optimal way to combine these two approaches. In this paper we propose introducing the dynamic capabilities of speech enhancement into a static adaptation scheme. We focus on variance adaptation, and propose a novel parametric variance model that includes static and dynamic components. The dynamic component is derived from a speech enhancement pre-process, and the parameters of the model are optimized using an adaptive training scheme. An evaluation of the method using speech dereverberation as the pre-processing step showed that an 80% relative error rate reduction was possible compared with recognizing the dereverberated speech directly, and the final error rate was 5.4%, which is close to that of clean speech (1.2%).
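
The idea of a variance model with a static and a dynamic component can be illustrated with a minimal sketch. The abstract does not give the model's functional form, so everything below is an assumption made for illustration: the linear combination, the weight names `alpha` and `lam`, and the use of the squared difference between observed and enhanced features as a per-frame proxy for the dynamic component. It is not the authors' implementation, and the adaptive training of the weights is not shown.

```python
import math


def dynamic_variance_proxy(observed, enhanced):
    """Per-frame, per-dimension proxy for the dynamic component.

    Assumption: the squared change applied by the enhancement front end
    is used as a rough indicator of how unreliable each feature is.
    """
    return [[(o - e) ** 2 for o, e in zip(obs_frame, enh_frame)]
            for obs_frame, enh_frame in zip(observed, enhanced)]


def combined_variance(static_var, dynamic_var, alpha, lam):
    """Hypothetical combined variance: lam * static + alpha * dynamic.

    static_var:  one variance per feature dimension (from the acoustic model).
    dynamic_var: one list of variances per frame (from the front end).
    """
    return [[lam * s + alpha * d for s, d in zip(static_var, frame)]
            for frame in dynamic_var]


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


if __name__ == "__main__":
    # Toy two-frame, two-dimensional example with made-up numbers.
    observed = [[1.0, 2.0], [0.5, 1.5]]
    enhanced = [[0.8, 1.0], [0.5, 1.4]]
    model_mean = [0.7, 1.2]
    model_var = [1.0, 2.0]

    dyn = dynamic_variance_proxy(observed, enhanced)
    frame_vars = combined_variance(model_var, dyn, alpha=0.5, lam=1.0)
    for frame, var in zip(enhanced, frame_vars):
        print(var, diag_gauss_loglik(frame, model_mean, var))
```

In this sketch, frames that the front end altered heavily receive a larger variance, so a mismatched feature is penalized less during likelihood computation, while `lam` rescales the model variance uniformly across all frames.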

Original language: English
Title of host publication: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Pages: 4073-4076
Number of pages: 4
DOI
Publication status: Published - 2008
Externally published: Yes
Event: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States
Duration: 31 Mar 2008 to 4 Apr 2008

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Country/Territory: United States
City: Las Vegas, NV
Period: 08/3/31 to 08/4/4

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
