Abstract
Robot audition, the capability of a robot to listen to several things at once with its own ears, is important for improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments, with enough flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software called "HARK", which includes sound source localization, sound source separation, and automatic speech recognition (ASR). Because separation introduces spectral distortion into the separated sounds, HARK generates a time-frequency map of reliability, called a "missing feature mask", for the features of separated sounds. The separated sounds are then recognized by an ASR based on Missing-Feature Theory (MFT) using these missing feature masks. HARK is implemented on the middleware "FlowDesigner" to share intermediate audio data, which enables real-time processing. HARK's performance in recognizing noisy and simultaneous speech is demonstrated on three humanoid robots, Honda ASIMO, SIG2, and Robovie, which have different microphone layouts.
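The idea of a missing feature mask can be illustrated with a minimal sketch. This is not HARK's implementation: the function names, the noise-ratio reliability rule, and the threshold of 0.5 are all assumptions made for illustration. The sketch builds a binary time-frequency mask from a separated feature map and a hypothetical per-bin noise estimate, then scores one frame against a diagonal Gaussian while skipping the features marked unreliable, which is the general marginalization idea behind MFT-based recognition.

```python
import math


def missing_feature_mask(separated, noise_estimate, threshold=0.5):
    """Return a binary time-frequency mask: 1 = reliable, 0 = unreliable.

    Hypothetical reliability rule (an assumption, not taken from HARK):
    a bin is reliable when the estimated noise accounts for less than
    `threshold` of its energy.
    """
    mask = []
    for sep_frame, noise_frame in zip(separated, noise_estimate):
        row = []
        for s, n in zip(sep_frame, noise_frame):
            reliable = s > 0 and (n / s) < threshold
            row.append(1 if reliable else 0)
        mask.append(row)
    return mask


def masked_log_likelihood(frame, mask_row, means, variances):
    """Diagonal-Gaussian log-likelihood that ignores unreliable features."""
    total = 0.0
    for x, m, mu, var in zip(frame, mask_row, means, variances):
        if m:  # only reliable features contribute to the score
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# Two frames (rows) by three frequency bins (columns); values are made up.
separated = [[4.0, 1.0, 9.0], [2.0, 8.0, 0.5]]
noise = [[1.0, 0.9, 1.0], [1.5, 1.0, 0.4]]
mask = missing_feature_mask(separated, noise)
```

With these made-up values, the bins dominated by the noise estimate are masked out, so they would not affect the acoustic score of the frame. A soft mask with values between 0 and 1 is a common alternative to the binary rule used here.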
Original language | English |
---|---|
Title of host publication | 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008 |
Pages | 561-566 |
Number of pages | 6 |
Publication status | Published - 2008 |
Externally published | Yes |
Event | 2008 8th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2008, Daejeon. Duration: 2008 Dec 1 → 2008 Dec 3 |
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Vision and Pattern Recognition
- Human-Computer Interaction