A language model adaptation using multiple varied corpora

H. Yamamoto, Y. Sagisaka

Research output: Conference contribution

4 Citations (Scopus)

Abstract

A new language model adaptation scheme is proposed to cope with multiple varied speech recognition tasks. Both topic difference and sentence style difference resulting from the speaker's role are reflected in the proposed language model adaptation. An adaptation is carried out using two different language corpora where only the topic or speaker's style is matched. New word clustering techniques are introduced to extract the topic or style dependency separately. Word neighboring characteristics in the two adaptation source data are regarded as different features in this clustering. All words are classified into commonly used word classes and topic or style dependent classes. Furthermore, target topic and sentence style dependent words and their neighboring characteristics are emphasized according to their frequency in the adaptation target data. In the evaluation experiment, the proposed method shows a 13% lower perplexity and a 9% lower word error rate in continuous speech recognition compared with the conventional adaptation method.

Original language: English
Title of host publication: 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 389-392
Number of pages: 4
ISBN (Electronic): 078037343X, 9780780373433
DOI
Publication status: Published - 2001
Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Madonna di Campiglio, Italy
Duration: 9 December 2001 to 13 December 2001

Publication series

Name: 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Conference Proceedings

Other

Other: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001
Country/Territory: Italy
City: Madonna di Campiglio
Period: 9 December 2001 to 13 December 2001

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering

