Improving speech recognition of two simultaneous speech signals by integrating ICA BSS and automatic missing feature mask generation

Ryu Takeda*, Shun'ichi Yamamoto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Conference contribution

1 citation (Scopus)

Abstract

Robot audition systems require capabilities for sound source separation and for recognition of the separated sounds, since we hear a mixture of sounds in our daily lives, especially mixtures of speech. We report a robot audition system with a pair of omni-directional microphones embedded in a humanoid that recognizes two simultaneous talkers. It first separates the sound sources by Independent Component Analysis (ICA) with the single-input multiple-output (SIMO) model. Spectral distortion in the separated sounds is then estimated to generate missing feature masks. Finally, the separated sounds are recognized by Automatic Speech Recognition (ASR) based on missing-feature theory (MFT). The novel aspects of our system are that spectral distortion is estimated in the time-frequency domain in terms of feature vectors, and that these estimates are based on the estimated error in the SIMO-ICA signals. The resulting system outperformed the baseline robot audition system by 7%.
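The mask-generation and recognition steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `make_mask` and `masked_log_likelihood`, the 0 dB threshold, and the hard (binary) mask rule are all assumptions chosen for illustration. The sketch marks a feature as reliable when the separated signal power exceeds the estimated error power, and then scores a feature vector against a diagonal Gaussian while skipping the unreliable dimensions, which is one common way missing-feature scoring is done.

```python
import numpy as np


def make_mask(separated_power, error_power, threshold_db=0.0):
    """Hypothetical hard mask: 1.0 where the separated-to-error power
    ratio (in dB) exceeds the threshold, 0.0 otherwise."""
    eps = 1e-12  # avoids division by zero and log of zero
    ratio_db = 10.0 * np.log10((separated_power + eps) / (error_power + eps))
    return (ratio_db > threshold_db).astype(float)


def masked_log_likelihood(features, mask, mean, var):
    """Diagonal-Gaussian log-likelihood per frame, summing only over
    dimensions the mask marks as reliable (unreliable ones are dropped)."""
    per_dim = -0.5 * (np.log(2.0 * np.pi * var) + (features - mean) ** 2 / var)
    return np.sum(mask * per_dim, axis=-1)


# Toy example: one frame with three feature dimensions (made-up numbers).
separated_power = np.array([[4.0, 1.0, 9.0]])
error_power = np.array([[1.0, 2.0, 1.0]])
mask = make_mask(separated_power, error_power)

features = np.array([[0.0, 5.0, 0.0]])  # the middle feature is distorted
mean, var = np.zeros(3), np.ones(3)

full = masked_log_likelihood(features, np.ones_like(mask), mean, var)
masked = masked_log_likelihood(features, mask, mean, var)
print(mask, full, masked)
```

In this toy case the middle dimension is flagged as unreliable, so the distorted value no longer penalizes the score of the acoustic model. A real system would derive the error estimate from the SIMO-ICA outputs and apply the mask inside the recognizer's acoustic scoring.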

Original language: English
Title of host publication: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Publisher: International Speech Communication Association
Pages: 2302-2305
Number of pages: 4
ISBN (Print): 9781604234497
Publication status: Published - 1 Jan 2006
Externally published: Yes
Event: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP - Pittsburgh, PA, United States
Duration: 17 Sep 2006 to 21 Sep 2006

Publication series

Name: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
5

Conference

Conference: INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Country/Territory: United States
City: Pittsburgh, PA
Period: 17 Sep 2006 to 21 Sep 2006

ASJC Scopus subject areas

  • General Computer Science
