TY - CONF
T1 - PHONEME RECOGNITION IN VARIOUS STYLES OF UTTERANCE BASED ON MUTUAL INFORMATION CRITERION
AU - Okawa, Shigeki
AU - Kobayashi, Tetsunori
AU - Shirai, Katsuhiko
N1 - Funding Information:
The authors are grateful to the members of the Laboratory for Spoken Language Processing of Waseda University for their help and discussions. This work is partly supported by a Grant-in-Aid for Scientific Research from the Ministry of Education, Science and Culture of Japan, No. 05241103.
Publisher Copyright:
© 1994 3rd International Conference on Spoken Language Processing, ICSLP 1994. All rights reserved.
PY - 1994
Y1 - 1994
N2 - This paper describes a highly reliable phoneme recognition method for various styles of utterance, based on the mutual information criterion. Mutual information is a good measure for building an effective phoneme dictionary through optimal selection of acoustic features and integration of clusters. Phonemic likelihoods for each frame are calculated from VQ code sequences organized by the hierarchical clustering method. Phoneme recognition is then performed by applying phonemic duration and phoneme bigram constraints. We also describe an iterative training mechanism for the phoneme dictionary. In a speaker-independent recognition experiment on continuous utterances, the phoneme correct rate improved to 90.5% (8.4% insertion, 7.0% deletion).
AB - This paper describes a highly reliable phoneme recognition method for various styles of utterance, based on the mutual information criterion. Mutual information is a good measure for building an effective phoneme dictionary through optimal selection of acoustic features and integration of clusters. Phonemic likelihoods for each frame are calculated from VQ code sequences organized by the hierarchical clustering method. Phoneme recognition is then performed by applying phonemic duration and phoneme bigram constraints. We also describe an iterative training mechanism for the phoneme dictionary. In a speaker-independent recognition experiment on continuous utterances, the phoneme correct rate improved to 90.5% (8.4% insertion, 7.0% deletion).
UR - http://www.scopus.com/inward/record.url?scp=85009222101&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85009222101&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85009222101
SP - 1911
EP - 1914
T2 - 3rd International Conference on Spoken Language Processing, ICSLP 1994
Y2 - 18 September 1994 through 22 September 1994
ER -