New speech enhancement: Speech stream segregation

Hiroshi G. Okuno*, Tomohiro Nakatani, Takeshi Kawabata

*Corresponding author for this work

Research output: Conference contribution

4 Citations (Scopus)

Abstract

Speech stream segregation is presented as a new speech enhancement technique for automatic speech recognition. Two issues are addressed: speech stream segregation from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition. Speech stream segregation is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, and substituting non-harmonic residue for non-harmonic parts of groups. The main problem in interfacing speech stream segregation with HMM-based speech recognition is how to reduce the degradation of recognition performance due to special distortion of segregated sounds, which is caused mainly by the transfer function of a binaural input. Our solution is to re-train the HMM parameters with training data binauralized for four directions. Experiments with 500 mixtures of two women's utterances of a word showed that the cumulative accuracy of word recognition up to the 10th candidate of each woman's utterance is, on average, 75%.
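
The evaluation figure quoted in the abstract, cumulative accuracy up to the 10th candidate, can be read as top-N accuracy: an utterance counts as correct at rank N if the reference word appears anywhere among the recognizer's first N candidates. The Python sketch below shows one way to compute such a curve, assuming the recognizer returns a ranked candidate list per utterance. The function name `cumulative_accuracy`, its parameters, and the sample words are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def cumulative_accuracy(
    ranked_candidates: Sequence[Sequence[str]],
    references: Sequence[str],
    max_rank: int = 10,
) -> list[float]:
    """Return a list whose entry k-1 is the fraction of utterances whose
    reference word appears among the top-k candidates, for k = 1..max_rank."""
    if len(ranked_candidates) != len(references):
        raise ValueError("one candidate list is required per reference word")
    if not references:
        raise ValueError("at least one utterance is required")

    hits = [0] * max_rank
    for candidates, reference in zip(ranked_candidates, references):
        top = list(candidates[:max_rank])
        if reference in top:
            # A hit at position `rank` also counts for every larger cutoff.
            rank = top.index(reference)
            for k in range(rank, max_rank):
                hits[k] += 1

    total = len(references)
    return [h / total for h in hits]


# Hypothetical toy data: three utterances, candidates ordered best first.
ranked = [
    ["ueno", "okuno", "ono"],   # reference found at rank 2
    ["kyoto", "tokyo"],         # reference found at rank 1
    ["osaka", "nara"],          # reference not found
]
refs = ["okuno", "kyoto", "kobe"]
curve = cumulative_accuracy(ranked, refs, max_rank=3)
```

Under this reading, the 75% reported in the abstract would correspond to the last entry of such a curve computed with `max_rank=10`, averaged over the two speakers in each mixture.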

Original language: English
Title of host publication: International Conference on Spoken Language Processing, ICSLP, Proceedings
Editors: Anon
Place of Publication: Piscataway, NJ, United States
Publisher: IEEE
Pages: 2356-2359
Number of pages: 4
4
Publication status: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA
Duration: 1996 Oct 3 to 1996 Oct 6

Other

Other: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4)
City: Philadelphia, PA, USA
Period: 96/10/3 to 96/10/6

ASJC Scopus subject areas

  • General Computer Science
