TY - GEN
T1 - Appearance-based human gesture recognition using multimodal features for human computer interaction
AU - Luo, Dan
AU - Gao, Hua
AU - Ekenel, Hazim Kemal
AU - Ohya, Jun
PY - 2011
Y1 - 2011
AB - The use of gesture as a natural interface plays a critically important role in achieving intelligent Human Computer Interaction (HCI). Human gestures include different components of visual actions, such as motion of the hands, facial expression, and torso, to convey meaning. So far, in the field of gesture recognition, most previous work has focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework that combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We refer to 12 classes of human gestures with facial expressions, including neutral, negative, and positive meanings, from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, an early feature combination is performed by concatenating and weighting different feature groups, and LDA is used to choose the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from single modalities are fused at a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with the combination techniques. Experimental results showed that facial analysis improves hand gesture recognition and that decision-level fusion performs better than feature-level fusion.
KW - Condensation Algorithm
KW - Facial Expression
KW - Gesture Recognition
UR - http://www.scopus.com/inward/record.url?scp=79953719719&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79953719719&partnerID=8YFLogxK
U2 - 10.1117/12.872525
DO - 10.1117/12.872525
M3 - Conference contribution
AN - SCOPUS:79953719719
SN - 9780819484024
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Proceedings of SPIE-IS&T Electronic Imaging - Human Vision and Electronic Imaging XVI
T2 - Human Vision and Electronic Imaging XVI
Y2 - 24 January 2011 through 27 January 2011
ER -