TY - GEN
T1 - SpeakBySinging
T2 - 13th International Conference on Digital Audio Effects, DAFx 2010
AU - Aso, Shimpei
AU - Saitou, Takeshi
AU - Goto, Masataka
AU - Itoyama, Katsutoshi
AU - Takahashi, Toru
AU - Komatani, Kazunori
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2010
Y1 - 2010
AB - This paper describes a singing-to-speaking synthesis system called "SpeakBySinging" that can synthesize a speaking voice from an input singing voice and the song lyrics. The system controls three acoustic features that determine the difference between speaking and singing voices: the fundamental frequency (F0), phoneme duration, and power (volume). By changing these features of a singing voice, the system synthesizes a speaking voice while retaining the timbre of the singing voice. The system first analyzes the singing voice to extract the F0 contour, the duration of each phoneme of the lyrics, and the power. These features are then converted to target values that are obtained by feeding the lyrics into a traditional text-to-speech (TTS) system. The system finally generates a speaking voice that preserves the timbre of the singing voice but has speech-like features. Experimental results show that SpeakBySinging can convert singing voices into speaking voices whose timbre is almost the same as that of the original singing voices.
UR - http://www.scopus.com/inward/record.url?scp=85139271179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85139271179&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85139271179
SN - 9783200019409
T3 - Proceedings of the International Conference on Digital Audio Effects, DAFx
BT - 13th International Conference on Digital Audio Effects, DAFx 2010 Proceedings
Y2 - 6 September 2010 through 10 September 2010
ER -