Abstract
This paper discusses the use of tree-based phone modeling to describe acoustic variations of speech and its application to a speech recognition system. Many sources of variability affect the realization of a phoneme: phonetic context, speaker, stress, speaking rate, and so on. Modeling these sources of variability explicitly gives more accurate and more detailed phone models, but requires a large amount of speech data for training. Tree-based phone modeling is studied as a solution to this problem through three case studies: phone models with large VQ codebook sizes, decision tree clustering, and speaker clustering. They are tested in speaker-independent continuous speech recognition experiments with a 991-word vocabulary. Tree-based phone modeling is shown to produce improvement in all three cases and to provide a good guide to achieving trainability and generalizability.
Original language | English |
---|---|
Pages | 705-708 |
Number of pages | 4 |
Publication status | Published - 1990 |
Externally published | Yes |
Event | 1st International Conference on Spoken Language Processing, ICSLP 1990 - Kobe, Japan Duration: 18 Nov 1990 → 22 Nov 1990 |
Conference
Conference | 1st International Conference on Spoken Language Processing, ICSLP 1990 |
---|---|
Country/Territory | Japan |
City | Kobe |
Period | 18 Nov 1990 → 22 Nov 1990 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language