Video translation system using face tracking and lip synchronization

S. Morishima*, Shin Ogata, S. Nakamura

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion while synchronizing it to the translated speech. To retain the speaker's facial expression, we replace only the image of the speech organs with a synthesized one, generated by a three-dimensional wire-frame model that is adaptable to any speaker. Our approach enables image synthesis and translation with an extremely small database. We also propose a method to track facial motion in the video image. In this system, movement and rotation of the head are detected by template matching using a 3D personal face wire-frame model. With this technique, automatic video translation can be achieved.
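The paper itself does not include code; as a rough illustration of the 2D template-matching idea underlying the head-tracking step (the actual system matches projections of a 3D personal wire-frame model, which is considerably more involved), here is a minimal sum-of-squared-differences template matcher in Python using NumPy. The function name and toy data are invented for illustration.

```python
import numpy as np

def track_template(frame, template):
    """Locate `template` in `frame` by exhaustive sum-of-squared-differences
    (SSD) matching; returns the (row, col) of the best-matching top-left corner."""
    fh, fw = frame.shape
    th, tw = template.shape
    best_ssd, best_pos = np.inf, (0, 0)
    for r in range(fh - th + 1):
        for c in range(fw - tw + 1):
            ssd = np.sum((frame[r:r + th, c:c + tw] - template) ** 2)
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (r, c)
    return best_pos

# Toy example: a 10x10 frame containing a distinctive 3x3 patch at (4, 6).
frame = np.zeros((10, 10))
patch = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
frame[4:7, 6:9] = patch
print(track_template(frame, patch))  # -> (4, 6)
```

A production tracker would typically use normalized cross-correlation (robust to lighting changes) and search only a small window around the previous frame's estimate rather than the whole image.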

Original language: English
Title of host publication: Proceedings - IEEE International Conference on Multimedia and Expo
Publisher: IEEE Computer Society
Pages: 649-652
Number of pages: 4
ISBN (Electronic): 0769511988
DOIs
Publication status: Published - 2001 Jan 1
Externally published: Yes
Event: 2001 IEEE International Conference on Multimedia and Expo, ICME 2001 - Tokyo, Japan
Duration: 2001 Aug 22 - 2001 Aug 25

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Other

Other: 2001 IEEE International Conference on Multimedia and Expo, ICME 2001
Country/Territory: Japan
City: Tokyo
Period: 01/8/22 - 01/8/25

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

