Improving speech recognition of two simultaneous speech signals by integrating ICA BSS and automatic missing feature mask generation

Rya Takeda*, Shu N.Ichi Yamamoto, Kazunori Komatoni, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Citation (Scopus)

Abstract

Robot audition systems require capabilities for sound source separation and the recognition of separated sounds, since we hear a mixture of sounds in our daily lives, especially mixed of speech. We report a robot audition system with a pair of omni-directional microphones embedded in a humanoid that recognizes two simultaneous talkers. It first separates the sound sources by Independent Component Analysis (ICA) with the single-input multiple-output (SIMO) model. Then, spectral distortion in the separated sounds is then estimated to generate missing feature masks. Finally, the separated sounds are recognized by missing-feature theory (MFT) for Automatic Speech Recognition (ASR). The novel aspects of our system involve estimates of spectral distortion in the temporal-frequency domain in terms of feature vectors and based on estimates error in SIMO-ICA signals. The resulting system outperformed the baseline robot audition system by 7 %.

Original languageEnglish
Title of host publicationINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
PublisherInternational Speech Communication Association
Pages2302-2305
Number of pages4
ISBN (Print)9781604234497
Publication statusPublished - 2006 Jan 1
Externally publishedYes
EventINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP - Pittsburgh, PA, United States
Duration: 2006 Sept 172006 Sept 21

Publication series

NameINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Volume5

Conference

ConferenceINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Country/TerritoryUnited States
CityPittsburgh, PA
Period06/9/1706/9/21

Keywords

  • ICA
  • Missing-feature mask
  • Robot audition

ASJC Scopus subject areas

  • Computer Science(all)

Fingerprint

Dive into the research topics of 'Improving speech recognition of two simultaneous speech signals by integrating ICA BSS and automatic missing feature mask generation'. Together they form a unique fingerprint.

Cite this