GENERATION OF PROSODY IN SPEECH SYNTHESIS USING LARGE SPEECH DATA-BASE

Naohiro Sakurai, Takemi Mochida, Tetsunori Kobayashi, Katsuhiko Shirai

Research output: Paper, peer-reviewed

1 citation (Scopus)

Abstract

To improve the naturalness of synthetic speech in Japanese text-to-speech or concept-to-speech conversion, we introduce a new scheme for synthesizing arbitrary sentences using a data-base of natural sentence speech. In our method, a series of synthesis parameters is generated from patterns extracted from natural speech waveforms. In the first step, a basic sentence is selected from the data-base for a target sentence. The selection factors are the phrase dependency structure (separation degree), the number of morae, the accent type, and the phonemic labels. In the second step, if necessary, a basic accent-phrase is selected from the same data-base for each target accent-phrase. The factors considered in selecting each accent-phrase are again the separation degree, the number of morae, the accent type, and the phonemic labels. In the third step, the pitch pattern is generated from the waveform units selected in the first and second steps. In the last step, the phonemic parameters are generated. The phonemic parameters for several morae have already been extracted in the preceding three steps, so in this step we only have to replace the phonemic parameters for ill-suited morae. Because the pitch pattern is generated from patterns extracted directly from real speech, it is expected to be more natural than a pattern estimated by a model. We have tested this method on Japanese sentence speech and confirmed that the synthetic sound preserves human-like features fairly well.
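The selection described in the second step can be illustrated with a minimal sketch. The abstract names the four selection factors but does not say how they are combined, so everything below is an assumption made for illustration: the `AccentPhrase` record, the weighted-sum `phrase_cost`, its default weights, and the `select_phrase` helper are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AccentPhrase:
    """Hypothetical record holding the four factors named in the abstract."""
    separation_degree: int      # from the phrase dependency structure
    mora_count: int             # number of morae
    accent_type: int            # accent type
    phonemes: Tuple[str, ...]   # phonemic labels, one per mora


def phrase_cost(target: AccentPhrase, candidate: AccentPhrase,
                weights: Tuple[float, float, float, float] = (1.0, 1.0, 1.0, 0.5)) -> float:
    """Assumed mismatch cost: a weighted sum over the four factors.

    The weights are arbitrary placeholders, not values from the paper.
    """
    w_sep, w_mora, w_acc, w_ph = weights
    # Count position-wise label mismatches, plus any difference in length.
    mismatched = sum(a != b for a, b in zip(target.phonemes, candidate.phonemes))
    mismatched += abs(len(target.phonemes) - len(candidate.phonemes))
    return (w_sep * abs(target.separation_degree - candidate.separation_degree)
            + w_mora * abs(target.mora_count - candidate.mora_count)
            + w_acc * (target.accent_type != candidate.accent_type)
            + w_ph * mismatched)


def select_phrase(target: AccentPhrase, database: Sequence[AccentPhrase]) -> AccentPhrase:
    """Return the data-base entry with the lowest assumed cost for the target."""
    return min(database, key=lambda cand: phrase_cost(target, cand))
```

In this sketch, a selected candidate whose phonemic labels differ from the target at some positions would correspond to the "ill-suited morae" whose phonemic parameters are replaced in the last step, while its pitch pattern is kept.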

Original language: English
Pages: 747-750
Number of pages: 4
Publication status: Published - 1994
Event: 3rd International Conference on Spoken Language Processing, ICSLP 1994 - Yokohama, Japan
Duration: 18 Sep 1994 → 22 Sep 1994

Conference

Conference: 3rd International Conference on Spoken Language Processing, ICSLP 1994
Country/Territory: Japan
City: Yokohama
Period: 94/9/18 → 94/9/22

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
