Improvement of robot audition by interfacing sound source separation and automatic speech recognition with missing feature theory

Shun'ichi Yamamoto*, Kazuhiro Nakadai, Hiroshi Tsujino, Toshio Yokoyama, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Conference contribution

30 Citations (Scopus)

Abstract

We have developed a robot audition system using the active direction-pass filter (ADPF) with the Scattering Theory, and demonstrated that the humanoid SIG could separate and recognize three simultaneous speeches originating from different directions. This is the first result showing that a robot can listen to several things simultaneously. However, its general applicability to other robots has not yet been confirmed. Since automatic speech recognition (ASR) requires direction- and speaker-dependent acoustic models, it is difficult to adapt to various kinds of environments. In addition, ASR with many acoustic models is slow. In this paper, these three problems are resolved. First, we confirmed the generality of the ADPF by applying it to two humanoids, SIG2 and Replie, under different environments. Next, we present a new interface between the ADPF and ASR based on the Missing Feature Theory, which masks broken features of the separated sound so that they are unavailable to ASR. This new interface improved the recognition performance for three simultaneous speeches up to about 90%. Finally, since the ASR uses only a single acoustic model that is direction- and speaker-independent and created under clean environments, the processing of the whole system became very light and fast.
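The masking idea in the abstract can be illustrated with a minimal sketch. It assumes a marginalization-style use of a missing-feature mask with diagonal-covariance Gaussian models: features flagged as unreliable are simply left out of the likelihood. The abstract does not say how the masks are generated or how the ASR decoder consumes them, so the function names, the toy models, and the data below are hypothetical and are not the authors' implementation.

```python
import math


def masked_log_likelihood(features, mask, mean, var):
    """Log-likelihood of one frame under a diagonal Gaussian.

    Only features whose mask entry is truthy (reliable) contribute.
    For a diagonal Gaussian, skipping a dimension is equivalent to
    marginalizing it out.
    """
    total = 0.0
    for x, m, mu, v in zip(features, mask, mean, var):
        if not m:
            continue  # unreliable feature: hidden from the recognizer
        total += -0.5 * (math.log(2.0 * math.pi * v) + (x - mu) ** 2 / v)
    return total


def classify(features, mask, models):
    """Return the name of the model with the highest masked likelihood."""
    return max(
        models,
        key=lambda name: masked_log_likelihood(features, mask, *models[name]),
    )


# Hypothetical toy acoustic models: name -> (mean vector, variance vector).
models = {
    "a": ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]),
    "b": ([3.0, 3.0, 3.0], [1.0, 1.0, 1.0]),
}

# A frame that matches model "a", except that the third feature is corrupted
# (for example, by leakage from another speaker after separation).
frame = [0.2, -0.1, 9.0]

print(classify(frame, [1, 1, 1], models))  # corrupted feature is used
print(classify(frame, [1, 1, 0], models))  # corrupted feature is masked
```

In this toy case the corrupted feature pulls the unmasked decision toward the wrong model, while masking it restores the match to the model that the reliable features support.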

Original language: English
Title of host publication: Proceedings - IEEE International Conference on Robotics and Automation
Pages: 1517-1523
Number of pages: 7
2004
2
Publication status: Published - 2004
Externally published: Yes
Event: Proceedings- 2004 IEEE International Conference on Robotics and Automation - New Orleans, LA, United States
Duration: 2004 Apr 26 to 2004 May 1

Other

Other: Proceedings- 2004 IEEE International Conference on Robotics and Automation
Country/Territory: United States
City: New Orleans, LA
Period: 04/4/26 to 04/5/1

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
