Abstract
Out-of-vocabulary (OOV) problems arise frequently when a language model is adapted to a new task in which the word classes have been observed but few of the individual words have, as with personal names, place names, and other proper nouns. Simple task adaptation cannot handle this problem properly. In this paper, we adopt a hierarchical language model for task-dependent OOV words in the noun category. In this model, the lower class model, which expresses word phonotactics, does not require any additional task-dependent corpora for training. It can be trained independently of the upper class model, which consists of conventional word-class N-grams, because the proposed hierarchical model clearly separates inter-word characteristics from intra-word characteristics. This independent, layered training makes it possible to apply the model to general vocabularies and tasks in combination with conventional language model adaptation techniques. In speech recognition experiments, compared with a conventional adapted model, word accuracy rose by 19 points (from 54% to 73%) on sentences containing OOV words and remained comparable (85%) on sentences without OOV words. This improvement corresponds to the performance obtained when all OOV words are ideally registered in the dictionary.
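
The two-layer factorization described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `BigramModel` class, the `<NAME>` class token, the toy training data, the add-one smoothing, and the use of letters as a stand-in for phonemes. The upper layer scores the class token given the preceding word, and the lower layer, trained separately on a task-independent name list, scores the internal structure of the unknown word.

```python
import math
from collections import defaultdict


class BigramModel:
    """Add-one smoothed bigram model over arbitrary symbol sequences."""

    def __init__(self, sequences):
        self.bigram = defaultdict(lambda: defaultdict(int))
        self.context = defaultdict(int)
        self.vocab = {"</s>"}
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            self.vocab.update(padded[1:])
            for prev, cur in zip(padded, padded[1:]):
                self.bigram[prev][cur] += 1
                self.context[prev] += 1

    def logprob(self, prev, cur):
        # One extra pseudo-count is reserved for symbols never seen in training.
        v = len(self.vocab) + 1
        return math.log((self.bigram[prev][cur] + 1) / (self.context[prev] + v))

    def sequence_logprob(self, seq):
        padded = ["<s>"] + list(seq) + ["</s>"]
        return sum(self.logprob(p, c) for p, c in zip(padded, padded[1:]))


# Upper layer (assumed toy data): word/class bigrams, with proper nouns
# replaced by the hypothetical class token <NAME>.
upper = BigramModel([
    ["i", "am", "<NAME>"],
    ["call", "<NAME>", "please"],
    ["<NAME>", "is", "here"],
])

# Lower layer (assumed toy data): intra-word model trained on a name list
# that is independent of the task corpus. Letters stand in for phonemes.
lower = BigramModel(["tanaka", "suzuki", "yamada", "nakamura"])

IN_VOCAB = {"i", "am", "call", "please", "is", "here"}


def hierarchical_logprob(tokens, in_vocab=IN_VOCAB):
    """Score a sentence; any word outside in_vocab is treated as a <NAME>."""
    total, prev = 0.0, "<s>"
    for tok in tokens:
        if tok in in_vocab:
            total += upper.logprob(prev, tok)
            prev = tok
        else:
            total += upper.logprob(prev, "<NAME>")  # inter-word part
            total += lower.sequence_logprob(tok)    # intra-word part
            prev = "<NAME>"
    total += upper.logprob(prev, "</s>")
    return total


name_like = hierarchical_logprob(["call", "yamanaka", "please"])
random_like = hierarchical_logprob(["call", "xqzvwkpj", "please"])
```

In this toy setup, the unseen but name-like word `yamanaka` scores higher than the random string `xqzvwkpj` in the same context, because only the lower layer differs between the two sentences. Retraining `lower` on a different name list leaves `upper` untouched, which mirrors the independent training of the two layers that the abstract describes.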
Original language | English |
---|---|
Title of host publication | EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology |
Publisher | International Speech Communication Association |
Pages | 221-224 |
Number of pages | 4 |
Publication status | Published - 2003 |
Event | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland Duration: 1 Sep 2003 → 4 Sep 2003 |
Other
Other | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 |
---|---|
Country/Territory | Switzerland |
City | Geneva |
Period | 1 Sep 2003 → 4 Sep 2003 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication