Auditory fovea based speech enhancement and its application to human-robot dialog system

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents an active direction-pass filter (ADPF) that separates sound arriving from a specified direction by using a pair of microphones, and reports its application to front-end processing for speech recognition. The ADPF improves sound source separation in two ways: it obtains an accurate sound direction through multi-modal integration, and it uses active motor control to keep the robot facing the sound source. Facing the source matters because the filter's resolution is much higher in the center direction than in the periphery, a property similar to that of the visual fovea. To recognize the separated sound streams, Hidden Markov Model (HMM) based automatic speech recognition is built with multiple acoustic models trained on the output of the ADPF under various conditions. Experimental results with a preliminary dialog system show that it works well even when two speakers speak simultaneously.
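
The abstract does not give the filter's internals, so the following is only a minimal sketch of the general idea of a direction-pass filter on two microphone signals, not the authors' ADPF. It assumes a free-field model in which the interaural phase difference (IPD) of a source at angle theta is 2*pi*f*d*sin(theta)/c, and it keeps only the frequency bins whose observed IPD falls inside the pass range around the target direction. The function names, the 0.18 m microphone spacing, the sign convention, and the decision to discard bins above the phase-ambiguity limit are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def expected_ipd(freqs, theta_deg, mic_distance):
    """Free-field IPD in radians for a source at theta_deg.

    Assumed convention: 0 degrees is straight ahead, and positive angles
    are toward the left microphone (the right channel is then delayed).
    """
    delay = mic_distance * np.sin(np.deg2rad(theta_deg)) / SPEED_OF_SOUND
    return 2.0 * np.pi * freqs * delay


def direction_pass_filter(left, right, fs, theta_deg, half_width_deg,
                          mic_distance=0.18):
    """Keep the part of `left` whose IPD matches the target direction.

    Hypothetical single-frame sketch: a practical system would work on
    short overlapping frames and handle high frequencies differently.
    """
    n = len(left)
    spec_l = np.fft.rfft(left)
    spec_r = np.fft.rfft(right)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Observed phase difference per bin, in (-pi, pi].
    observed = np.angle(spec_l * np.conj(spec_r))

    # IPD bounds for the pass range [theta - width, theta + width].
    lo = expected_ipd(freqs, theta_deg - half_width_deg, mic_distance)
    hi = expected_ipd(freqs, theta_deg + half_width_deg, mic_distance)

    # Above c / (2 d) the phase wraps and the IPD becomes ambiguous;
    # this sketch simply drops those bins.
    unambiguous = freqs <= SPEED_OF_SOUND / (2.0 * mic_distance)

    mask = unambiguous & (observed >= lo) & (observed <= hi)
    return np.fft.irfft(spec_l * mask, n=n)
```

Because `half_width_deg` is a parameter, a caller could pass a narrow range when the target is straight ahead and a wider one toward the sides, which is one way to mirror the fovea-like resolution the abstract describes. The specific widths would have to come from measurement and are not stated in the abstract.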

Original language: English
Title of host publication: 7th International Conference on Spoken Language Processing, ICSLP 2002
Publisher: International Speech Communication Association
Pages: 1817-1820
Number of pages: 4
Publication status: Published - 2002
Externally published: Yes
Event: 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration: 2002 Sept 16 to 2002 Sept 20

Other

Other: 7th International Conference on Spoken Language Processing, ICSLP 2002
Country/Territory: United States
City: Denver
Period: 02/9/16 to 02/9/20

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
