TY - GEN
T1 - Rectified linear unit can assist Griffin-Lim phase recovery
AU - Yatabe, Kohei
AU - Masuyama, Yoshiki
AU - Oikawa, Yasuhiro
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/11/2
Y1 - 2018/11/2
N2 - Phase recovery is an essential process for reconstructing a time-domain signal from the corresponding spectrogram when its phase is contaminated or unavailable. Recently, a phase recovery method using a deep neural network (DNN) was proposed, which interested us because the inverse short-time Fourier transform (inverse STFT) was utilized within the network. This inverse STFT converts a spectrogram into its time-domain counterpart, and then the activation function, a leaky rectified linear unit (ReLU), is applied. Such a nonlinear operation in the time domain resembles the speech enhancement method called harmonic regeneration noise reduction (HRNR). In HRNR, a time-domain nonlinearity, typically ReLU, is applied to assist in enhancing the higher-order harmonics. From this point of view, one question arose: can a time-domain ReLU alone assist phase recovery? Inspired by this curious connection between the recent DNN-based phase recovery method and HRNR in speech enhancement, the ReLU-assisted Griffin-Lim algorithm is proposed in this paper to investigate the above question. Through an experiment on speech denoising with the oracle Wiener filter, some positive effect of the time-domain nonlinearity is confirmed in terms of short-time objective intelligibility (STOI) scores.
KW - Consistency
KW - Harmonic regeneration
KW - Redundancy
KW - Spectrogram
KW - Time-domain nonlinearity
UR - http://www.scopus.com/inward/record.url?scp=85057416166&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057416166&partnerID=8YFLogxK
U2 - 10.1109/IWAENC.2018.8521304
DO - 10.1109/IWAENC.2018.8521304
M3 - Conference contribution
AN - SCOPUS:85057416166
T3 - 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings
SP - 555
EP - 559
BT - 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018
Y2 - 17 September 2018 through 20 September 2018
ER -