Abstract
We propose a robust language modeling method for cases where only a small training text corpus is available. In this method, the word bigram and the class bigram are combined using a weighting function of the preceding word's frequency. We conducted speech recognition experiments using the JNAS speech corpus. The results showed that the class-combined bigram performs on par with a word bigram trained on a corpus 2.5 times larger. We also conducted experiments on sports news dialogue from TV broadcasts. The recognition accuracy of the class-combined bigram was 83.3%, which was 5.5 points higher than that of the word bigram.
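As a rough illustration of the idea of combining the two models, the following minimal Python sketch interpolates a word bigram and a class bigram with a weight that depends on how often the preceding word occurred in training. The abstract does not give the actual weighting function, so the form `n / (n + k)`, the constant `k`, the maximum-likelihood estimates without smoothing, and all function names here are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter


def train(sentences, word2class):
    """Collect counts for word-bigram and class-bigram estimates."""
    uni, hist, bi = Counter(), Counter(), Counter()
    cuni, chist, cbi = Counter(), Counter(), Counter()
    for sent in sentences:
        for w in sent:
            uni[w] += 1
            cuni[word2class[w]] += 1
        for v, w in zip(sent, sent[1:]):
            cv, cw = word2class[v], word2class[w]
            hist[v] += 1          # times v occurs as a preceding word
            bi[(v, w)] += 1
            chist[cv] += 1        # times class cv occurs as a preceding class
            cbi[(cv, cw)] += 1
    return {"uni": uni, "hist": hist, "bi": bi,
            "cuni": cuni, "chist": chist, "cbi": cbi, "w2c": word2class}


def word_bigram(m, v, w):
    """Maximum-likelihood P(w | v); 0.0 if v was never seen as a history."""
    n = m["hist"][v]
    return m["bi"][(v, w)] / n if n else 0.0


def class_bigram(m, v, w):
    """P(C(w) | C(v)) * P(w | C(w)), the usual class-bigram factorisation."""
    cv, cw = m["w2c"][v], m["w2c"][w]
    n, members = m["chist"][cv], m["cuni"][cw]
    if n == 0 or members == 0:
        return 0.0
    return (m["cbi"][(cv, cw)] / n) * (m["uni"][w] / members)


def weight(n, k=5.0):
    """Hypothetical weighting function: trust the word bigram more as the
    preceding word's frequency n grows. The form n / (n + k) is an assumption."""
    return n / (n + k)


def combined_prob(m, v, w, k=5.0):
    """Interpolate the word and class bigrams using the frequency-based weight."""
    lam = weight(m["hist"][v], k)
    return lam * word_bigram(m, v, w) + (1.0 - lam) * class_bigram(m, v, w)
```

With such a weight, a rarely seen preceding word leans on the class bigram, so a word pair never observed in training can still receive non-zero probability when its class pair was observed; a frequently seen preceding word relies mostly on its own word bigram statistics.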
Original language | English |
---|---|
Pages | 1599-1602 |
Number of pages | 4 |
Publication status | Published - 1999 |
Event | 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 - Budapest, Hungary; Duration: 1999 Sept 5 → 1999 Sept 9 |
Conference
Conference | 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 |
---|---|
Country/Territory | Hungary |
City | Budapest |
Period | 1999 Sept 5 → 1999 Sept 9 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication