TY - JOUR
T1 - A Media Conversion from Speech to Facial Image for Intelligent Man-Machine Interface
AU - Morishima, Shigeo
PY - 1991/5
Y1 - 1991/5
N2 - An automatic facial motion image synthesis scheme, driven by speech, and a real-time image synthesis design are presented. The purpose of this research is to realize an “intelligent” human-machine interface or “intelligent” communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme, and a voice-to-image conversion scheme are proposed in this paper. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the latter case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions, other than mouth shape and jaw position, also can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display, at a TV image video rate.
UR - http://www.scopus.com/inward/record.url?scp=0026156861&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0026156861&partnerID=8YFLogxK
U2 - 10.1109/49.81953
DO - 10.1109/49.81953
M3 - Article
AN - SCOPUS:0026156861
SN - 0733-8716
VL - 9
SP - 594
EP - 600
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
IS - 4
ER -