TY - GEN
T1 - CNN-based virtual microphone signal estimation for MPDR beamforming in underdetermined situations
AU - Yamaoka, Kouei
AU - Li, Li L.
AU - Ono, Nobutaka
AU - Makino, Shoji
AU - Yamada, Takeshi
N1 - Funding Information:
ACKNOWLEDGMENTS This work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Numbers JP16H01735, JPH04131, and JP19J20420.
Publisher Copyright:
© 2019 IEEE
PY - 2019/9
Y1 - 2019/9
N2 - In this paper, we propose a novel approach to virtually increasing the number of microphone elements between two real microphones to improve speech enhancement performance in underdetermined situations. The virtual microphone technique, with which virtual signals in the audio signal domain are estimated by linearly interpolating the phase and nonlinearly interpolating the amplitude independently on the basis of βdivergence, has been recently proposed and experimentally shown to be effective in improving speech enhancement performance. Furthermore, it has been reported that the performance tends to improve as the nonlinearity is improved. However, one drawback of this method is that the interpolation is employed in each time-frequency bin independently, in which the spectral and temporal structures of speech signals are ignored. To address this problem and improve the nonlinearity, motivated by the high capability of neural networks to model nonlinear functions and speech spectrograms, in this paper, we propose an alternative method of amplitude interpolation. In this method, we employ a convolutional neural network as an amplitude estimator that minimizes the mean squared error between the outputs of a minimum power distortionless response (MPDR) beamformer and the target speech signals. The experimental results revealed that the proposed method showed high potential for improving speech enhancement performance, which was not only superior to that of the conventional virtual microphone technique but also the performance in the corresponding determined situation.
AB - In this paper, we propose a novel approach to virtually increasing the number of microphone elements between two real microphones to improve speech enhancement performance in underdetermined situations. The virtual microphone technique, with which virtual signals in the audio signal domain are estimated by linearly interpolating the phase and nonlinearly interpolating the amplitude independently on the basis of βdivergence, has been recently proposed and experimentally shown to be effective in improving speech enhancement performance. Furthermore, it has been reported that the performance tends to improve as the nonlinearity is improved. However, one drawback of this method is that the interpolation is employed in each time-frequency bin independently, in which the spectral and temporal structures of speech signals are ignored. To address this problem and improve the nonlinearity, motivated by the high capability of neural networks to model nonlinear functions and speech spectrograms, in this paper, we propose an alternative method of amplitude interpolation. In this method, we employ a convolutional neural network as an amplitude estimator that minimizes the mean squared error between the outputs of a minimum power distortionless response (MPDR) beamformer and the target speech signals. The experimental results revealed that the proposed method showed high potential for improving speech enhancement performance, which was not only superior to that of the conventional virtual microphone technique but also the performance in the corresponding determined situation.
UR - http://www.scopus.com/inward/record.url?scp=85075620779&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075620779&partnerID=8YFLogxK
U2 - 10.23919/EUSIPCO.2019.8903040
DO - 10.23919/EUSIPCO.2019.8903040
M3 - Conference contribution
AN - SCOPUS:85075620779
T3 - European Signal Processing Conference
BT - EUSIPCO 2019 - 27th European Signal Processing Conference
PB - European Signal Processing Conference, EUSIPCO
T2 - 27th European Signal Processing Conference, EUSIPCO 2019
Y2 - 2 September 2019 through 6 September 2019
ER -