STATISTICAL MODELING AND RECOGNITION OF RHYTHM IN SPEECH

Satoru Hayamizu, Kazuyo Tanaka

Research output: Contribution to conference, Paper, peer-review

Abstract

This paper proposes a new framework for processing rhythm in speech where temporal types are recognized using statistical models of mora durations. Temporal patterns, such as rhythm and tempo in speech, contain some basic information about communication through the spoken language. This information has not yet been fully used in speech recognition. This paper proposes that temporal types themselves be modeled and recognized by statistical models. Using the ASJ Continuous Speech Database, experiments for recognizing temporal types of bunsetsu (short phrases) were conducted. Approximately 72% of temporal types were identified correctly using these models, without using information about the length of pauses and fundamental frequencies. The recognized types were very consistent (approximately 94% were of the same types) for closed and open models. These results show the promising potential of the proposed framework.

Original language: English
Pages: 199-202
Number of pages: 4
Publication status: Published - 1994
Externally published: Yes
Event: 3rd International Conference on Spoken Language Processing, ICSLP 1994 - Yokohama, Japan
Duration: 1994 Sept 18 to 1994 Sept 22

Conference

Conference: 3rd International Conference on Spoken Language Processing, ICSLP 1994
Country/Territory: Japan
City: Yokohama
Period: 94/9/18 to 94/9/22

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
