SpeakBySinging: Converting singing voices to speaking voices while retaining voice timbre

Shimpei Aso*, Takeshi Saitou, Masataka Goto, Katsutoshi Itoyama, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Conference contribution

9 Citations (Scopus)

Abstract

This paper describes a singing-to-speaking synthesis system called "SpeakBySinging" that can synthesize a speaking voice from an input singing voice and the song lyrics. The system controls three acoustic features that determine the difference between speaking and singing voices: the fundamental frequency (F0), phoneme duration, and power (volume). By changing these features of a singing voice, the system synthesizes a speaking voice while retaining the timbre of the singing voice. The system first analyzes the singing voice to extract the F0 contour, the duration of each phoneme of the lyrics, and the power. These features are then converted to target values that are obtained by feeding the lyrics into a traditional text-to-speech (TTS) system. The system finally generates a speaking voice that preserves the timbre of the singing voice but has speech-like features. Experimental results show that SpeakBySinging can convert singing voices into speaking voices whose timbre is almost the same as the original singing voices.
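The conversion flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the analysis or synthesis method, so the sketch assumes frame-based features, a simple linear time warp within each phoneme, and direct substitution of the TTS-derived F0 and power contours. All names (`Phoneme`, `warp_indices`, `convert`) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence


@dataclass
class Phoneme:
    """One phoneme of the lyrics (hypothetical structure for this sketch)."""
    label: str
    sung_frames: int    # duration measured in the singing voice, in frames
    spoken_frames: int  # target duration taken from a TTS reference, in frames


def warp_indices(phonemes: Sequence[Phoneme]) -> List[int]:
    """For each output (speech) frame, choose which singing frame to reuse.

    Assumption: positions are mapped linearly within each phoneme.
    """
    indices: List[int] = []
    sung_start = 0
    for p in phonemes:
        if p.sung_frames <= 0:
            raise ValueError(f"phoneme {p.label!r} has no singing frames")
        for k in range(p.spoken_frames):
            # Relative position of the centre of output frame k in the phoneme.
            pos = (k + 0.5) / p.spoken_frames
            offset = min(int(pos * p.sung_frames), p.sung_frames - 1)
            indices.append(sung_start + offset)
        sung_start += p.sung_frames
    return indices


def convert(
    sung_envelopes: Sequence[Any],
    phonemes: Sequence[Phoneme],
    target_f0: Sequence[float],
    target_power: Sequence[float],
) -> List[Dict[str, Any]]:
    """Combine singing-voice timbre frames with speech-like F0 and power.

    The spectral envelopes (timbre) come from the singing voice; F0 and power
    come from the TTS-derived targets, one value per output frame.
    """
    idx = warp_indices(phonemes)
    if not (len(idx) == len(target_f0) == len(target_power)):
        raise ValueError("target contours must match the warped frame count")
    return [
        {"envelope": sung_envelopes[i], "f0": f0, "power": pw}
        for i, f0, pw in zip(idx, target_f0, target_power)
    ]
```

For example, a vowel sung over four frames with a two-frame speaking target is reduced to two of its original envelope frames, which are then paired with the target F0 and power values. A real system would pass the resulting frames to a vocoder to generate the waveform.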

Original language: English
Title of host publication: 13th International Conference on Digital Audio Effects, DAFx 2010 Proceedings
Publication status: Published - 2010
Externally published: Yes
Event: 13th International Conference on Digital Audio Effects, DAFx 2010 - Graz, Austria
Duration: 6 Sep 2010 to 10 Sep 2010

Publication series

Name: Proceedings of the International Conference on Digital Audio Effects, DAFx
ISSN (Print): 2413-6700
ISSN (Electronic): 2413-6689

Conference

Conference: 13th International Conference on Digital Audio Effects, DAFx 2010
Country/Territory: Austria
City: Graz
Period: 6 Sep 2010 to 10 Sep 2010

ASJC Scopus subject areas

  • Signal Processing
