Data selection by sequence summarizing neural network in mismatch condition training

Kateřina Žmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukáš Burget, Jan Černocký

Research output: Conference article (peer-reviewed)

4 Citations (Scopus)

Abstract

Data augmentation is a simple and efficient technique to improve the robustness of a speech recognizer when it is deployed in mismatched training-test conditions. Our paper proposes a new approach for selecting data with respect to the similarity of acoustic conditions. The similarity is computed based on a sequence summarizing neural network, which extracts vectors containing an acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. The results are reported on a mismatched condition using the AMI training set with the proposed data selection and the CHiME3 test set.
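
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the network forms its summary or which similarity measure is used. The sketch assumes that a summary-vector is the time average of per-frame embeddings, that similarity is cosine similarity, and that candidates are ranked against the mean summary-vector of the target condition. The function names `summary_vector`, `cosine_similarity`, and `select_utterances` are hypothetical.

```python
import numpy as np


def summary_vector(frame_embeddings):
    """Collapse per-frame embeddings (T x D) into one utterance-level vector.

    Assumption: the summary is a plain average over time.
    """
    return np.asarray(frame_embeddings, dtype=float).mean(axis=0)


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def select_utterances(candidate_vectors, target_vectors, k):
    """Return indices of the k candidates most similar to the target condition.

    Assumption: the target condition is represented by the mean of its
    summary-vectors, and candidates are ranked by cosine similarity to it.
    """
    target_mean = np.mean(np.asarray(target_vectors, dtype=float), axis=0)
    scores = np.array(
        [cosine_similarity(c, target_mean) for c in candidate_vectors]
    )
    order = np.argsort(-scores, kind="stable")  # highest similarity first
    return order[:k].tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 5 candidate utterances and 3 target-condition
    # utterances, each with 20 frames of 4-dimensional embeddings.
    candidates = [summary_vector(rng.normal(size=(20, 4))) for _ in range(5)]
    targets = [summary_vector(rng.normal(size=(20, 4))) for _ in range(3)]
    print(select_utterances(candidates, targets, k=2))
```

Ranking against a single mean vector is only one possible selection rule; the abstract states that several selection methods were explored without naming them.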

Original language: English
Pages (from-to): 2354-2358
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
08-12-September-2016
Publication status: Published - 2016
Externally published: Yes
Event: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration: 8 September 2016 to 16 September 2016

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
