Enhanced robot speech recognition based on microphone array source separation and missing feature theory

Shun'ichi Yamamoto*, Jean-Marc Valin, Kazuhiro Nakadai, Jean Rouat, François Michaud, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

53 Citations (Scopus)

Abstract

A humanoid robot in a real-world environment usually hears mixtures of sounds, so three capabilities are essential for robot audition: sound source localization, separation, and recognition of the separated sounds. While the first two are frequently addressed, the last has received less study. We present a system that gives a humanoid robot the ability to localize, separate, and recognize simultaneous sound sources. A microphone array is used along with a dedicated real-time implementation of Geometric Source Separation (GSS) and a multi-channel post-filter that further reduces interference from other sources. An automatic speech recognizer (ASR) based on Missing Feature Theory (MFT) recognizes the separated sounds in real time, using missing feature masks generated automatically from the post-filtering step. The main advantage of this approach for humanoid robots is that an ASR with a clean acoustic model can adapt to the distortion of the separated sound by consulting the post-filter feature masks. Recognition rates are presented for three simultaneous speakers located 2 m from the robot. Using both the post-filter and the missing feature mask yields an average relative reduction in error rate of 42%.
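The abstract does not spell out how the masks are computed or how the recognizer consumes them. As a rough illustration of the general missing-feature idea only, the Python sketch below assumes a binary mask obtained by thresholding the fraction of energy the post-filter keeps in each feature, and a diagonal-Gaussian acoustic score that simply skips unreliable features (marginalization). The function names, the 0.5 threshold, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math


def missing_feature_mask(input_energy, output_energy, threshold=0.5):
    """Return 1 (reliable) or 0 (unreliable) per feature.

    Assumption: a feature is treated as reliable when the post-filter
    kept at least `threshold` of its input energy. The threshold value
    is illustrative, not the paper's.
    """
    mask = []
    for e_in, e_out in zip(input_energy, output_energy):
        ratio = e_out / e_in if e_in > 0 else 0.0
        mask.append(1 if ratio >= threshold else 0)
    return mask


def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Unreliable dimensions contribute nothing to the score, so a model
    trained on clean speech is not penalized for distorted features.
    """
    total = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        if m:
            total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# Toy example with hypothetical per-feature energies.
mask = missing_feature_mask([1.0, 2.0, 4.0, 0.0], [0.9, 0.4, 3.0, 0.0])

# Second feature is badly distorted; masking it removes its penalty.
distorted = [0.0, 5.0]
score_masked = masked_log_likelihood(distorted, [1, 0], [0.0, 0.0], [1.0, 1.0])
score_unmasked = masked_log_likelihood(distorted, [1, 1], [0.0, 0.0], [1.0, 1.0])
```

In this toy case the masked score is higher than the unmasked one because the distorted second feature is ignored, which is the intuition behind letting a clean acoustic model tolerate separation artifacts.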

Original language: English
Title of host publication: Proceedings of the 2005 IEEE International Conference on Robotics and Automation
Pages: 1477-1482
Number of pages: 6
Publication status: Published - 2005
Externally published: Yes
Event: 2005 IEEE International Conference on Robotics and Automation - Barcelona, Spain
Duration: 2005 Apr 18 to 2005 Apr 22

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
Volume: 2005
ISSN (Print): 1050-4729

Conference

Conference: 2005 IEEE International Conference on Robotics and Automation
Country/Territory: Spain
City: Barcelona
Period: 05/4/18 to 05/4/22

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering
