TY - JOUR
T1 - Binaural active audition for humanoid robots to localise speech over entire azimuth range
AU - Kim, Hyun Don
AU - Komatani, Kazunori
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
N1 - Funding Information: Ministry of Education, Culture, Sports, Science, and Technology (http://dx.doi.org/10.13039/501100001700).
N1 - Author affiliation: Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Kyoto, Japan.
N1 - Copyright © 2009 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
PY - 2009/9
Y1 - 2009/9
AB - We applied motion theory to robot audition to improve the inadequate performance. Motions are critical for overcoming the ambiguity and sparseness of information obtained by two microphones. To realise this, we first designed a sound source localisation system integrated with cross-power spectrum phase (CSP) analysis and an EM algorithm. The CSP of sound signals obtained with only two microphones was used to localise the sound source without having to measure impulse response data. The expectation-maximisation (EM) algorithm helped the system to cope with several moving sound sources and reduce localisation errors. We then proposed a way of constructing a database for moving sounds to evaluate binaural sound source localisation. We evaluated our sound localisation method using artificial moving sounds and confirmed that it could effectively localise moving sounds slower than 1.125 rad/s. Consequently, we solved the problem of distinguishing whether sounds were coming from the front or rear by rotating and/or tipping the robot's head that was equipped with only two microphones. Our system was applied to a humanoid robot called SIG2, and we confirmed its ability to localise sounds over the entire azimuth range as the success rates for sound localisation in the front and rear areas were 97.6% and 75.6% respectively.
KW - Active audition
KW - Human-robot interaction
KW - Humanoid robots
KW - Sound source localisation
UR - http://www.scopus.com/inward/record.url?scp=77949409699&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77949409699&partnerID=8YFLogxK
U2 - 10.1080/11762320903007430
DO - 10.1080/11762320903007430
M3 - Article
AN - SCOPUS:77949409699
SN - 1176-2322
VL - 6
SP - 355
EP - 367
JO - Applied Bionics and Biomechanics
JF - Applied Bionics and Biomechanics
IS - 3-4
ER -