TY - GEN
T1 - Improved sound source localization and front-back disambiguation for humanoid robots with two ears
AU - Kim, Ui Hyun
AU - Nakadai, Kazuhiro
AU - Okuno, Hiroshi G.
PY - 2013
Y1 - 2013
N2 - An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with humanoid robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using a humanoid robot showed that localization errors were reduced by 9.9° on average with the improved method and that the success rate for front-back disambiguation was 32.2% better on average over the entire azimuth than with a conventional HRTF-based method.
KW - front-back disambiguation
KW - human-robot interaction
KW - intelligent robot audition
KW - sound source localization
UR - http://www.scopus.com/inward/record.url?scp=84881379529&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881379529&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-38577-3_29
DO - 10.1007/978-3-642-38577-3_29
M3 - Conference contribution
AN - SCOPUS:84881379529
SN - 9783642385766
VL - 7906 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 282
EP - 291
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
T2 - 26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2013
Y2 - 17 June 2013 through 21 June 2013
ER -