Online integration of DNN-Based and spatial clustering-based mask estimation for robust MVDR beamforming

Yutaro Matsui, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Nobutaka Ito, Shoko Araki, Shoji Makino

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

9 Citations (Scopus)

Abstract

This paper discusses the online estimation of time-frequency masks, which makes it possible to perform mask-based beamforming online for robust automatic speech recognition (ASR). Two approaches to online mask estimation have been developed separately for this purpose. One is based on a deep neural network (DNN) and exploits the spectral features of the signal. The other is based on spatial clustering (SC) and exploits the spatial features of the signal. This paper proposes a new method that integrates the two approaches to further improve online mask estimation by exploiting the advantages of both. Experiments using the real data of the CHiME-3 multichannel noisy speech corpus show that the proposed method greatly outperforms the conventional approaches in terms of word error rate (WER).
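
For readers unfamiliar with mask-based beamforming, the following is a minimal sketch of how a time-frequency mask can drive an MVDR beamformer. It is not the method of the paper: it processes the whole utterance in batch rather than online, it takes the speech mask as given instead of estimating it with a DNN or spatial clustering, and it assumes the commonly used reference-microphone MVDR formulation. The function name `mask_based_mvdr`, the array layout, and the parameters `ref_mic` and `eps` are illustrative assumptions.

```python
import numpy as np


def mask_based_mvdr(Y, speech_mask, ref_mic=0, eps=1e-8):
    """Batch mask-based MVDR sketch (hypothetical helper, not from the paper).

    Y           : complex STFT of the microphone signals, shape (F, T, M)
    speech_mask : speech presence mask in [0, 1], shape (F, T)
    ref_mic     : index of the reference microphone (assumption)
    eps         : small constant for numerical stability (assumption)
    Returns the beamformed STFT, shape (F, T).
    """
    F, T, M = Y.shape
    noise_mask = 1.0 - speech_mask
    out = np.zeros((F, T), dtype=complex)

    for f in range(F):
        Yf = Y[f]  # (T, M), one row per frame

        # Mask-weighted spatial covariance matrices: sum_t m(t) y(t) y(t)^H
        ms = speech_mask[f][:, None]
        mn = noise_mask[f][:, None]
        phi_s = (ms * Yf).T @ Yf.conj() / (ms.sum() + eps)
        phi_n = (mn * Yf).T @ Yf.conj() / (mn.sum() + eps)
        phi_n = phi_n + eps * np.eye(M)  # diagonal loading

        # w = (phi_n^{-1} phi_s) u_ref / trace(phi_n^{-1} phi_s)
        numerator = np.linalg.solve(phi_n, phi_s)
        w = numerator[:, ref_mic] / (np.trace(numerator) + eps)

        # Apply the filter to every frame: w^H y(t)
        out[f] = Yf @ w.conj()

    return out
```

Calling `mask_based_mvdr(Y, speech_mask)` with an STFT array of shape `(F, T, M)` returns a single-channel enhanced STFT of shape `(F, T)`. An online variant would replace the per-utterance sums with recursively updated covariance estimates; how the paper does this, and how it combines the DNN and SC masks, is not described in the abstract and is not reproduced here.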

Original language: English
Title of host publication: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 71-75
Number of pages: 5
ISBN (Electronic): 9781538681510
Publication status: Published - 2018 Nov 2
Externally published: Yes
Event: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Tokyo, Japan
Duration: 2018 Sept 17 to 2018 Sept 20

Publication series

Name: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings

Other

Other: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018
Country/Territory: Japan
City: Tokyo
Period: 18/9/17 to 18/9/20

Keywords

  • Beamforming
  • DNN
  • Online processing
  • Spatial clustering
  • Time-frequency masking

ASJC Scopus subject areas

  • Signal Processing
  • Acoustics and Ultrasonics
