This paper presents an optimal method for determining the phoneme sequence from frame-level phoneme likelihoods together with statistics on phoneme duration and connectivity. Phoneme recognition in continuous speech often produces many deletion and insertion errors, so reducing these errors is important for realizing an advanced continuous speech recognition system. Our algorithm is based on dynamic programming (DP). The duration of each phoneme is represented by a stochastic model, and the connectivity of phonemes is determined from phonetic and phonological knowledge. We also propose applying the method to word detection, using prosodic information to determine phrase boundaries. As a result, speaker-dependent recognition performance improved to 94.6% (3.7% insertions, 3.5% deletions) for word utterances and 67.1% (17.7%, 16.4%) for sentence utterances. Word detection performance is 69.0% for independent words. These scores are much better than those obtained with our previous system.
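The abstract does not give the recurrence itself, so the following is only a minimal sketch of one common way to combine the three ingredients it names: a duration-constrained (segmental) DP over frame-level log-likelihoods. The function name `decode_phonemes`, the table layouts (`frame_loglik[t][p]`, `dur_logprob[p][d]`, `allowed[q][p]`), and the toy numbers are all assumptions made for illustration and are not taken from the paper.

```python
import math

NEG_INF = float("-inf")

def decode_phonemes(frame_loglik, dur_logprob, allowed, max_dur):
    """Hypothetical segmental DP; returns (score, [(phoneme, start, end), ...]).

    frame_loglik[t][p]: log-likelihood of phoneme p at frame t
    dur_logprob[p][d]:  log-probability that phoneme p lasts d frames (index 0 unused)
    allowed[q][p]:      True if phoneme p may follow phoneme q
    """
    n_frames, n_phon = len(frame_loglik), len(frame_loglik[0])
    # prefix[p][t] = summed log-likelihood of phoneme p over frames [0, t)
    prefix = [[0.0] * (n_frames + 1) for _ in range(n_phon)]
    for p in range(n_phon):
        for t in range(n_frames):
            prefix[p][t + 1] = prefix[p][t] + frame_loglik[t][p]

    # best[t][p]: best score of a segmentation of frames [0, t) ending in phoneme p
    best = [[NEG_INF] * n_phon for _ in range(n_frames + 1)]
    back = [[None] * n_phon for _ in range(n_frames + 1)]
    for t in range(1, n_frames + 1):
        for p in range(n_phon):
            for d in range(1, min(max_dur, t) + 1):
                s = t - d  # start frame of the candidate segment
                seg = prefix[p][t] - prefix[p][s] + dur_logprob[p][d]
                if s == 0:
                    cand, prev = seg, None
                else:
                    # best permitted predecessor phoneme ending at frame s
                    prev_score, prev = NEG_INF, None
                    for q in range(n_phon):
                        if allowed[q][p] and best[s][q] > prev_score:
                            prev_score, prev = best[s][q], q
                    if prev is None:
                        continue
                    cand = prev_score + seg
                if cand > best[t][p]:
                    best[t][p], back[t][p] = cand, (s, prev)

    p = max(range(n_phon), key=lambda k: best[n_frames][k])
    score = best[n_frames][p]
    if score == NEG_INF:
        raise ValueError("no segmentation satisfies the constraints")
    segments, t = [], n_frames
    while t > 0:  # trace the best path back to frame 0
        s, prev = back[t][p]
        segments.append((p, s, t))
        t, p = s, prev
    segments.reverse()
    return score, segments

# Toy input: frame 1 slightly favours phoneme 1, which a frame-wise
# argmax would turn into a spurious one-frame insertion.
lik = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]
frame_loglik = [[math.log(v) for v in row] for row in lik]
dur = [NEG_INF] + [math.log(v) for v in (0.05, 0.25, 0.5, 0.2)]
dur_logprob = [dur, dur]
allowed = [[False, True], [True, False]]  # a phoneme may not follow itself

score, segments = decode_phonemes(frame_loglik, dur_logprob, allowed, max_dur=4)
print(segments)
```

In this toy case the duration term makes a one-frame segment expensive, so the decoder keeps frames 0 to 2 as a single phoneme-0 segment instead of inserting a phoneme at frame 1. That is the kind of insertion-error reduction the abstract motivates, although the paper's actual duration and connectivity models may differ.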
|Publication status||Published - 1992|
|Event||2nd International Conference on Spoken Language Processing, ICSLP 1992 - Banff, Canada|
Duration: 13 Oct 1992 → 16 Oct 1992