Analysis of robustness of deep single-channel speech separation using corpora constructed from multiple domains

Matthew Maciejewski, Gregory Sell, Yusuke Fujita, Leibny Paola Garcia-Perera, Shinji Watanabe, Sanjeev Khudanpur

Research output: Conference contribution

11 citations (Scopus)

Abstract

Deep-learning based single-channel speech separation has been studied with great success, though evaluations have typically been limited to relatively controlled environments based on clean, near-field, and read speech. This work investigates the robustness of such representative techniques in more realistic environments with multiple and diverse conditions. To this end, we first construct datasets from the Mixer 6 and CHiME-5 corpora, featuring studio interviews and dinner parties respectively, using a procedure carefully designed to generate desirable synthetic overlap data sufficient for evaluation as well as for training deep learning models. Using these new datasets, we demonstrate the substantial shortcomings in mismatched conditions of these separation techniques. Though multi-condition training greatly mitigated the performance degradation in near-field conditions, one of the important findings is that both matched and multi-condition training have significant gaps from the oracle performance in far-field conditions, which advocates a need for extending existing separation techniques to deal with far-field/highly-reverberant speech mixtures.

Original language: English
Title of host publication: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 165-169
Number of pages: 5
ISBN (electronic): 9781728111230
DOI
Publication status: Published - October 2019
Externally published: Yes
Event: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019 - New Paltz, United States
Duration: October 20, 2019 to October 23, 2019

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
2019-October
ISSN (print): 1931-1168
ISSN (electronic): 1947-1629

Conference

Conference: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019
Country/Territory: United States
City: New Paltz
Period: October 20, 2019 to October 23, 2019

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
