Multiclass composite N-gram language model based on connection direction

Hirofumi Yamamoto*, Yoshinori Sagisaka

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

The authors propose a method to generate a compact, highly reliable language model for speech recognition based on the efficient classification of words. In this method, the connectedness with the words immediately before and after the word is taken to represent separate attributes, and individual classification is performed for each word. The resulting composite word class is created separately based on the distribution of words connected before or after. As a result, classification of classes is efficient and reliable. In a multiclass composite N-gram, which uses the proposed method for the variable-order N-gram to bring in chain words, the entry size is reduced to one-tenth, and the word recognition rate is higher than that of a conventional composite N-gram for particles or variable-length word arrays.
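As a rough illustration of the idea in the abstract, the sketch below gives each word two class labels: one for how it connects to the preceding word and one for how it connects to the following word. The factorization used here, P(w_i | w_{i-1}) ≈ P(C_to(w_i) | C_from(w_{i-1})) · P(w_i | C_to(w_i)), is an assumption about how the two class sets combine in a bigram; the abstract does not state it. The class maps, the toy corpus, and the function names are all hypothetical and hand-written, whereas the paper derives its classes automatically.

```python
from collections import Counter

# Hypothetical toy class maps (not taken from the paper).
# TO_CLASS groups words by how they connect to the PRECEDING word;
# FROM_CLASS groups words by how they connect to the FOLLOWING word.
# "a" and "an" share a to-class (same left context) but have different
# from-classes (different right context).
TO_CLASS = {"eat": "T_verb", "buy": "T_verb", "a": "T_det", "an": "T_det",
            "pear": "T_noun_c", "apple": "T_noun_v"}
FROM_CLASS = {"eat": "F_verb", "buy": "F_verb", "a": "F_a", "an": "F_an",
              "pear": "F_noun", "apple": "F_noun"}

# Hypothetical three-sentence corpus, for illustration only.
TOY_CORPUS = [["eat", "a", "pear"], ["eat", "an", "apple"], ["buy", "a", "pear"]]


def train(sentences):
    """Collect the counts needed for a two-class-set (multiclass) bigram."""
    trans = Counter()     # (from-class of previous word, to-class of current word)
    from_tot = Counter()  # how often each from-class appears as a predecessor
    emit = Counter()      # (to-class, word) for words in successor position
    to_tot = Counter()    # how often each to-class appears in successor position
    for sent in sentences:
        for prev, cur in zip(sent, sent[1:]):
            trans[(FROM_CLASS[prev], TO_CLASS[cur])] += 1
            from_tot[FROM_CLASS[prev]] += 1
            emit[(TO_CLASS[cur], cur)] += 1
            to_tot[TO_CLASS[cur]] += 1
    return trans, from_tot, emit, to_tot


def prob(model, prev, cur):
    """Unsmoothed estimate of P(cur | prev) under the assumed factorization."""
    trans, from_tot, emit, to_tot = model
    f, t = FROM_CLASS[prev], TO_CLASS[cur]
    if from_tot[f] == 0 or to_tot[t] == 0:
        return 0.0
    class_transition = trans[(f, t)] / from_tot[f]
    word_given_class = emit[(t, cur)] / to_tot[t]
    return class_transition * word_given_class


model = train(TOY_CORPUS)
print(prob(model, "buy", "an"))   # word pair never seen, still gets probability
print(prob(model, "a", "apple"))  # blocked: from-class of "a" never precedes T_noun_v
```

In this toy setup a single shared class for "a" and "an" would either lose the distinction in what follows them or force separate statistics for what precedes them; keeping two class sets lets each direction be pooled independently, which is one plausible reading of why the abstract reports a smaller and more reliable model.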

Original language: English
Pages (from-to): 108-114
Number of pages: 7
Journal: Systems and Computers in Japan
Volume: 34
Issue number: 7
Publication status: Published - 2003 Jun 30
Externally published: Yes

Keywords

  • Automatic class classification
  • Chain words
  • Class N-gram
  • Variable-order N-gram

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Hardware and Architecture
  • Computational Theory and Mathematics
