We have developed a conversation robot that recognizes the user's head gestures and uses the recognition results as para-linguistic information. In conversation, humans exchange linguistic information, which can be obtained by transcribing the utterance, and para-linguistic information, which helps convey the linguistic information. Para-linguistic information carries nuances that linguistic information alone cannot transmit, and thereby makes conversation natural and effective. In this paper, we recognize the user's head gestures as para-linguistic information in the visual channel. We use the optical flow over the head region as the feature and model it with hidden Markov models (HMMs) for recognition. In actual conversation, the robot may perform a gesture while the user is performing one. In this situation, the image sequence captured by the camera mounted on the eyes of the robot sways because the camera itself is moving. To solve this problem, we introduce two techniques. The first concerns feature extraction: the optical flow of the body area is used to compensate for the sway in the images. The second concerns the probability models: mode-dependent models are prepared with the MLLR (maximum likelihood linear regression) model adaptation technique, and the models are switched according to the motion mode of the robot. Experimental results show the effectiveness of these techniques.
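The feature-side compensation can be illustrated with a minimal sketch. The abstract does not say how the body-area flow is applied, so the version below assumes the simplest reading: camera sway moves the head and body regions by roughly the same amount, and subtracting the mean body flow from the mean head flow leaves the user's own head motion. The function name `compensate_head_flow`, the box layout `(top, bottom, left, right)`, and the synthetic flow field are hypothetical and are not taken from the paper.

```python
import numpy as np

def compensate_head_flow(flow, head_box, body_box):
    """Return the mean head flow with the mean body flow subtracted.

    flow: dense optical flow of shape (H, W, 2), one (dx, dy) per pixel.
    head_box, body_box: (top, bottom, left, right) pixel bounds (assumed layout).
    Subtraction is an assumption; the paper may compensate differently.
    """
    def mean_flow(box):
        top, bottom, left, right = box
        return flow[top:bottom, left:right].reshape(-1, 2).mean(axis=0)

    return mean_flow(head_box) - mean_flow(body_box)

# Synthetic frame: camera sway shifts every pixel by (3, -1),
# and the user's head additionally moves by (0, 2), as in a nod.
flow = np.zeros((120, 160, 2))
flow += np.array([3.0, -1.0])
head_box = (10, 40, 60, 100)
body_box = (50, 110, 40, 120)
flow[10:40, 60:100] += np.array([0.0, 2.0])

feature = compensate_head_flow(flow, head_box, body_box)
print(feature)
```

In this toy case the sway component cancels and only the head's own motion remains. A sequence of such per-frame vectors is the kind of feature an HMM recognizer would consume.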
|Number of pages||6|
|Publication status||Published - 2004 Dec 1|
|Event||RO-MAN 2004 - 13th IEEE International Workshop on Robot and Human Interactive Communication - Okayama, Japan|
|Duration||2004 Sept 20 → 2004 Sept 22|