Real-time speech enhancement using equilibriated RNN

Daiki Takeuchi, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada

Research output: Conference contribution

31 Citations (Scopus)

Abstract

We propose a speech enhancement method using a causal deep neural network (DNN) for real-time applications. DNNs have been widely used for estimating a time-frequency (T-F) mask that enhances a speech signal. One popular DNN structure for this purpose is the recurrent neural network (RNN), owing to its capability of effectively modelling time-sequential data like speech. In particular, the long short-term memory (LSTM) is often used to alleviate the vanishing/exploding gradient problem, which makes the training of an RNN difficult. However, the LSTM pays for this easier training with a larger number of parameters, which requires more computational resources. For real-time speech enhancement, it is preferable to use a smaller network without losing performance. In this paper, we propose to use the equilibriated recurrent neural network (ERNN) to avoid the vanishing/exploding gradient problem without increasing the number of parameters. The proposed structure is causal, that is, it requires only information from the past, so that it can be applied in real time. Compared to uni- and bi-directional LSTM networks, the proposed method achieved similar performance with far fewer parameters.
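The causal T-F masking pipeline described in the abstract can be illustrated with a minimal sketch. It assumes a Hann-windowed STFT and uses a hypothetical `ToyCausalMaskEstimator` (a running noise-floor tracker) purely as a stand-in for the trained network. It is not the ERNN and not the authors' implementation, and the frame length, hop size, and smoothing factor are illustrative assumptions. The point it shows is the data flow: each frame's mask depends only on the current and past frames, and the enhanced spectrum is the element-wise product of the mask and the noisy spectrum.

```python
import numpy as np


def stft_frames(x, frame_len=512, hop=256):
    """Split x into Hann-windowed frames and return their one-sided spectra."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)


class ToyCausalMaskEstimator:
    """Hypothetical stand-in for a trained causal network (NOT the ERNN).

    The recurrent state is a per-bin running noise-floor estimate that is
    updated from the current and past frames only.
    """

    def __init__(self, n_bins, smoothing=0.9):
        self.noise_floor = np.zeros(n_bins)
        self.smoothing = smoothing
        self.initialized = False

    def step(self, magnitude):
        # Update the recurrent state from the current frame.
        if not self.initialized:
            self.noise_floor = magnitude.copy()
            self.initialized = True
        else:
            self.noise_floor = (
                self.smoothing * self.noise_floor
                + (1.0 - self.smoothing) * magnitude
            )
        # Spectral-subtraction-style gain, guaranteed to lie in [0, 1).
        power = magnitude ** 2
        noise_power = self.noise_floor ** 2
        return np.maximum(power - noise_power, 0.0) / (power + 1e-12)


def enhance_causally(spectra, estimator):
    """Apply a T-F mask frame by frame, never looking at future frames."""
    enhanced = np.empty_like(spectra)
    for t, frame in enumerate(spectra):
        mask = estimator.step(np.abs(frame))
        enhanced[t] = mask * frame  # element-wise masking of the noisy spectrum
    return enhanced
```

In a real system, the `step` method would be replaced by one forward step of the trained recurrent network, with its hidden state carried across frames in the same way.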

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 851-855
Number of pages: 5
ISBN (Electronic): 9781509066315
DOI
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: May 4, 2020 to May 8, 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country/Territory: Spain
City: Barcelona
Period: May 4, 2020 to May 8, 2020

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
