Robot with two ears listens to more than two simultaneous utterances by exploiting harmonic structures

Yasuharu Hirasawa*, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Conference contribution

Abstract

In real-world situations, people often hear more than two simultaneous sounds. For robots, when the number of sound sources exceeds the number of sensors, the situation is called under-determined, and robots with two ears need to deal with it. Some studies on under-determined sound source separation use L1-norm minimization methods, but automatic speech recognition performs poorly on the separated speech signals because of the spectral distortion these methods introduce. This paper presents a two-stage separation method that improves separation quality at low computational cost. The first stage uses an L1-norm minimization method to extract the harmonic structures. The second stage exploits reliable harmonic structures to maintain acoustic features. Experiments that simulate three utterances recorded by two microphones in an anechoic chamber show that our method improves speech recognition correctness by about three points and is fast enough for real-time separation.
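To make the under-determined setting concrete, the following is a minimal sketch of L1-norm minimization for a single time-frequency bin with two microphones and three sources. It is not the authors' implementation. It assumes the mixing matrix `A` is already known and uses a common combinatorial shortcut: try every pair of sources, solve the resulting 2x2 system exactly, and keep the pair whose solution has the smallest L1 norm. The function name `l1_min_bin`, the phase-delay steering vectors, and the example values are all hypothetical.

```python
from itertools import combinations

import numpy as np


def l1_min_bin(A, x):
    """Estimate source coefficients s for one time-frequency bin with A @ s = x.

    A: (n_mics, n_sources) complex mixing matrix (assumed known in this sketch).
    x: (n_mics,) complex observation at the microphones.
    Each subset of n_mics sources is solved exactly; the subset whose
    solution has the smallest L1 norm is returned, with other sources at zero.
    """
    n_mics, n_src = A.shape
    best_s, best_cost = None, np.inf
    for subset in combinations(range(n_src), n_mics):
        idx = list(subset)
        sub = A[:, idx]
        if abs(np.linalg.det(sub)) < 1e-12:
            continue  # skip (nearly) collinear steering vectors
        coeffs = np.linalg.solve(sub, x)
        cost = np.abs(coeffs).sum()
        if cost < best_cost:
            best_cost = cost
            best_s = np.zeros(n_src, dtype=complex)
            best_s[idx] = coeffs
    return best_s


# Hypothetical example: three sources, two microphones, unit-norm columns.
phases = np.array([-1.0, 0.0, 1.0])  # assumed inter-microphone phase delays
A = np.vstack([np.ones(3), np.exp(-1j * phases)]) / np.sqrt(2)
s_true = np.array([0.0, 1.0 + 0.5j, 0.0])  # only the second source is active
x = A @ s_true
s_est = l1_min_bin(A, x)
```

With unit-norm columns and a single active source, the triangle inequality guarantees that no other pair can have a smaller L1 norm, so the sketch recovers `s_true`. When several sources overlap in the same bin, the minimum-L1 solution can differ from the true one, which is one way spectral distortion of the kind mentioned in the abstract can arise.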

Original language: English
Title of host publication: Modern Approaches in Applied Intelligence - 24th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2011, Proceedings
Pages: 348-358
Number of pages: 11
PART 1
DOI
Publication status: Published - 2011
Externally published: Yes
Event: 24th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2011 - Syracuse, NY, United States
Duration: 28 Jun 2011 to 1 Jul 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
6703 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2011
Country/Territory: United States
City: Syracuse, NY
Period: 28 Jun 2011 to 1 Jul 2011

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
