Abstract
Speech stream segregation is presented as a new speech enhancement technique for automatic speech recognition. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition. Speech stream segregation is modeled as a process of extracting harmonic fragments, grouping the extracted fragments, and substituting non-harmonic residue for the non-harmonic parts of each group. The main problem in interfacing speech stream segregation with HMM-based speech recognition is how to reduce the degradation of recognition performance caused by spectral distortion of the segregated sounds, which stems mainly from the transfer function of the binaural input. Our solution is to re-train the HMM parameters with training data binauralized for four directions. Experiments with 500 mixtures of two women's utterances of a word showed that the cumulative accuracy of word recognition up to the 10th candidate for each woman's utterance was, on average, 75%.
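The evaluation metric in the abstract, cumulative accuracy up to the 10th candidate, can be read as the fraction of utterances whose correct word appears anywhere in the recognizer's top 10 ranked candidates. The following is a minimal sketch under that reading, not the authors' evaluation code; the function name `cumulative_accuracy` and the toy candidate lists are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def cumulative_accuracy(
    ranked: Sequence[Sequence[str]],
    truth: Sequence[str],
    n: int = 10,
) -> float:
    """Fraction of utterances whose correct word is among the top-n candidates.

    ranked[i] is the recognizer's candidate list for utterance i, best first.
    truth[i] is the correct word for utterance i.
    """
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        return 0.0
    # An utterance counts as a hit if its correct word is within the first n candidates.
    hits = sum(1 for cands, word in zip(ranked, truth) if word in cands[:n])
    return hits / len(truth)


# Hypothetical toy data: two utterances, two candidates each.
candidates = [["asa", "aki"], ["umi", "ume"]]
correct = ["aki", "uma"]
top2 = cumulative_accuracy(candidates, correct, n=2)
```

With `n=1` this reduces to ordinary first-candidate accuracy, so a cumulative figure at the 10th candidate is an upper bound on what a recognizer achieves when only its best guess is counted.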
Original language | English |
---|---|
Title of host publication | International Conference on Spoken Language Processing, ICSLP, Proceedings |
Editors | Anon |
Place of Publication | Piscataway, NJ, United States |
Publisher | IEEE |
Pages | 2356-2359 |
Number of pages | 4 |
Volume | 4 |
Publication status | Published - 1996 |
Externally published | Yes |
Event | Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA Duration: 3 Oct 1996 → 6 Oct 1996 |
ASJC Scopus subject areas
- General Computer Science