Noisy-target Training: A Training Strategy for DNN-based Speech Enhancement without Clean Speech

Takuya Fujimura, Yuma Koizumi, Kohei Yatabe*, Ryoichi Miyazaki

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

Deep neural network (DNN)-based speech enhancement ordinarily requires clean speech signals as the training target. However, collecting clean signals is very costly because they must be recorded in a studio. This requirement currently restricts the amount of training data for speech enhancement to less than 1/1000 of that available for speech recognition, which does not need clean signals. Increasing the amount of training data is important for improving performance, and hence the requirement for clean signals should be relaxed. In this paper, we propose a training strategy that does not require clean signals. The proposed method uses only noisy signals for training, which enables us to use a variety of speech signals recorded in the wild. Our experimental results showed that the proposed method can achieve performance similar to that of a DNN trained with clean signals.
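
The abstract does not say how training pairs are built from noisy signals alone. One plausible reading, in the spirit of the Noise2Noise keyword listed for this paper, is to treat an already-noisy recording as the target and to form the network input by mixing further noise into it. The sketch below illustrates only that pairing step under this assumption; the function name `make_noisy_target_pair`, the `snr_db` parameter, and the random stand-in signals are hypothetical and are not taken from the paper.

```python
import numpy as np


def make_noisy_target_pair(noisy_speech, extra_noise, snr_db):
    """Build one (input, target) pair from a noisy recording (assumed scheme).

    target: the noisy recording itself (no clean speech is needed)
    input:  the same recording with extra noise mixed in at `snr_db`,
            where the SNR is measured relative to the noisy target.
    """
    target = np.asarray(noisy_speech, dtype=np.float64)
    # Repeat or trim the noise clip so it matches the target length.
    noise = np.resize(np.asarray(extra_noise, dtype=np.float64), target.shape)

    # Scale the noise so that 10*log10(P_target / P_noise) equals snr_db.
    p_target = np.mean(target ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_target / (p_noise * 10.0 ** (snr_db / 10.0)))

    return target + gain * noise, target


# Random stand-ins for a noisy recording and a separate noise clip.
rng = np.random.default_rng(0)
noisy_speech = rng.standard_normal(16000)
extra_noise = rng.standard_normal(4000)

model_input, target = make_noisy_target_pair(noisy_speech, extra_noise, snr_db=5.0)
```

A denoising network would then be trained to map `model_input` to `target` with an ordinary regression loss; the network architecture and loss are left out here because the abstract does not specify them.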

Original language: English
Title of host publication: 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 436-440
Number of pages: 5
ISBN (Electronic): 9789082797060
Publication status: Published - 2021
Event: 29th European Signal Processing Conference, EUSIPCO 2021 - Dublin, Ireland
Duration: 2021 Aug 23 → 2021 Aug 27

Publication series

Name: European Signal Processing Conference
Volume: 2021-August
ISSN (Print): 2219-5491

Conference

Conference: 29th European Signal Processing Conference, EUSIPCO 2021
Country/Territory: Ireland
City: Dublin
Period: 21/8/23 → 21/8/27

Keywords

  • Deep neural network (DNN)
  • Noise2Noise
  • Single-channel speech enhancement
  • Training target

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
