TY - GEN
T1 - Real-time auditory and visual talker tracking through integrating EM algorithm and particle filter
AU - Kim, Hyun Don
AU - Komatani, Kazunori
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2007
Y1 - 2007
AB - This paper presents techniques that enable talker tracking for effective human-robot interaction. We propose a new way of integrating an EM algorithm and a particle filter to select an appropriate path for tracking the talker. Our system can easily adapt to new kinds of information for tracking the talker, because it estimates the position of the desired talker through means, variances, and weights calculated from EM training, regardless of the number or kind of information. In addition, to enhance the robot's ability to track a talker in real-world environments, we applied the particle filter to talker tracking after executing the EM algorithm. We also integrated a variety of auditory and visual information regarding sound localization, face localization, and the detection of lip movement. Moreover, we applied a sound classification function that allows our system to distinguish among voice, music, and noise. We also developed a vision module that can locate moving objects.
KW - EM
KW - Human-robot interaction
KW - Lip movement detection
KW - Particle filter
KW - Sound source localization
KW - Talker tracking
UR - http://www.scopus.com/inward/record.url?scp=37249024128&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=37249024128&partnerID=8YFLogxK
DO - 10.1007/978-3-540-73325-6_28
M3 - Conference contribution
AN - SCOPUS:37249024128
SN - 9783540733225
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 280
EP - 290
BT - New Trends in Applied Artificial Intelligence - 20th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE 2007, Proceedings
PB - Springer Verlag
T2 - 20th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE 2007
Y2 - 26 June 2007 through 29 June 2007
ER -