Abstract
This paper proposes a new framework for processing rhythm in speech, in which temporal types are recognized using statistical models of mora durations. Temporal patterns in speech, such as rhythm and tempo, carry basic information about spoken communication, but this information has not yet been fully exploited in speech recognition. The proposed approach models and recognizes the temporal types themselves with statistical models. Experiments on recognizing the temporal types of bunsetsu (short phrases) were conducted using the ASJ Continuous Speech Database. Approximately 72% of temporal types were identified correctly using these models, without using information about pause length or fundamental frequency. The recognized types were highly consistent across closed and open models (approximately 94% were of the same type). These results show the promise of the proposed framework.
Original language | English |
---|---|
Pages | 199-202 |
Number of pages | 4 |
Publication status | Published - 1994 |
Externally published | Yes |
Event | 3rd International Conference on Spoken Language Processing, ICSLP 1994 - Yokohama, Japan (Duration: 1994 Sept 18 → 1994 Sept 22) |
Conference
Conference | 3rd International Conference on Spoken Language Processing, ICSLP 1994 |
---|---|
Country/Territory | Japan |
City | Yokohama |
Period | 94/9/18 → 94/9/22 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language