Song2Face: Synthesizing Singing Facial Animation from Audio

Shohei Iwase, Takuya Kato, Shugo Yamaguchi, Tsuchiya Yukitaka, Shigeo Morishima

Research output: Conference contribution

Citations (Scopus): 1

Abstract

We present Song2Face, a deep neural network capable of producing singing facial animation from an input of singing voice and singer label. The network architecture is built upon our insight that, although facial expression when singing varies between individuals, singing voices carry valuable information, such as pitch, breath, and vibrato, to which expressions may be attributed. Therefore, our network consists of an encoder that extracts relevant vocal features from audio, and a regression network conditioned on a singer label that predicts control parameters for facial animation. In contrast to prior audio-driven speech animation methods, which first map audio to text-level features, we show that vocal features can be learned directly from singing voice without any explicit constraints. Our network is capable of producing movements for all parts of the face as well as rotational movement of the head itself. Furthermore, stylistic differences in expression between singers are captured via the singer label, and thus the singing style of the resulting animation can be manipulated at test time.
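The two-stage design described in the abstract (an audio encoder followed by a regression network conditioned on a singer label) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The layer types, the feature sizes, the one-hot concatenation used for conditioning, and every name below (`init_params`, `encode`, `regress`, `song2face_sketch`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

# All sizes are illustrative assumptions, not values from the paper.
N_AUDIO_FEATS = 40   # per-frame audio features (e.g. spectrogram bins)
N_VOCAL_FEATS = 16   # size of the learned vocal feature vector
N_SINGERS = 3        # number of distinct singer labels
N_CONTROLS = 20      # facial control parameters, including head rotation


def init_params(rng):
    """Create random, untrained weights for the two hypothetical stages."""
    return {
        "w_enc": rng.normal(scale=0.1, size=(N_AUDIO_FEATS, N_VOCAL_FEATS)),
        "b_enc": np.zeros(N_VOCAL_FEATS),
        "w_reg": rng.normal(scale=0.1, size=(N_VOCAL_FEATS + N_SINGERS, N_CONTROLS)),
        "b_reg": np.zeros(N_CONTROLS),
    }


def encode(audio_frames, params):
    """Stage 1 (assumed form): map (T, N_AUDIO_FEATS) audio to vocal features."""
    return np.tanh(audio_frames @ params["w_enc"] + params["b_enc"])


def regress(vocal_feats, singer_id, params):
    """Stage 2 (assumed form): predict control parameters per frame,
    conditioned on the singer by appending a one-hot label to each frame."""
    one_hot = np.zeros(N_SINGERS)
    one_hot[singer_id] = 1.0
    label = np.tile(one_hot, (vocal_feats.shape[0], 1))
    conditioned = np.concatenate([vocal_feats, label], axis=1)
    return conditioned @ params["w_reg"] + params["b_reg"]


def song2face_sketch(audio_frames, singer_id, params):
    """Full pipeline: audio frames and singer label in, control parameters out."""
    return regress(encode(audio_frames, params), singer_id, params)


if __name__ == "__main__":
    params = init_params(np.random.default_rng(0))
    audio = np.random.default_rng(1).normal(size=(50, N_AUDIO_FEATS))
    # Same audio, different singer label: only the conditioning input changes.
    style_a = song2face_sketch(audio, 0, params)
    style_b = song2face_sketch(audio, 1, params)
    print(style_a.shape, np.abs(style_a - style_b).max())
```

Because the singer label enters only at the regression stage, swapping `singer_id` at inference time changes the predicted controls for the same audio, which mirrors the test-time style manipulation the abstract describes.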

Original language: English
Title of host publication: SIGGRAPH Asia 2020 Technical Communications, SA 2020
Publisher: Association for Computing Machinery, Inc
ISBN (electronic): 9781450380805
Publication status: Published - Dec 1, 2020
Event: SIGGRAPH Asia 2020 Technical Communications - International Conference on Computer Graphics and Interactive Techniques, SA 2020 - Virtual, Online, Korea, Republic of
Duration: Dec 4, 2020 to Dec 13, 2020

Publication series

Name: SIGGRAPH Asia 2020 Technical Communications, SA 2020

Conference

Conference: SIGGRAPH Asia 2020 Technical Communications - International Conference on Computer Graphics and Interactive Techniques, SA 2020
Country/Territory: Korea, Republic of
City: Virtual, Online
Period: Dec 4, 2020 to Dec 13, 2020

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Software
  • Computer Vision and Pattern Recognition
