TY - GEN
T1 - Joint Separation and Dereverberation of Reverberant Mixtures with Multichannel Variational Autoencoder
AU - Inoue, Shota
AU - Kameoka, Hirokazu
AU - Li, Li
AU - Seki, Shogo
AU - Makino, Shoji
N1 - Funding Information:
This work was supported by JSPS KAKENHI Grant Numbers 17H01763 and 18J20059, and by the SECOM Science and Technology Foundation.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - In this paper, we deal with a multichannel source separation problem under a highly reverberant condition. The multichannel variational autoencoder (MVAE) is a recently proposed source separation method that employs the decoder distribution of a conditional VAE (CVAE) as the generative model for the complex spectrograms of the underlying source signals. Although MVAE is notable in that it can significantly improve the source separation performance compared with conventional methods, its capability to separate highly reverberant mixtures is still limited since MVAE uses an instantaneous mixture model. To overcome this limitation, in this paper we propose extending MVAE to simultaneously solve source separation and dereverberation problems by formulating the separation system as a frequency-domain convolutive mixture model. A convergence-guaranteed algorithm based on the coordinate descent method is derived for the optimization. Experimental results revealed that the proposed method outperformed the conventional methods in terms of all the source separation criteria in highly reverberant environments.
AB - In this paper, we deal with a multichannel source separation problem under a highly reverberant condition. The multichannel variational autoencoder (MVAE) is a recently proposed source separation method that employs the decoder distribution of a conditional VAE (CVAE) as the generative model for the complex spectrograms of the underlying source signals. Although MVAE is notable in that it can significantly improve the source separation performance compared with conventional methods, its capability to separate highly reverberant mixtures is still limited since MVAE uses an instantaneous mixture model. To overcome this limitation, in this paper we propose extending MVAE to simultaneously solve source separation and dereverberation problems by formulating the separation system as a frequency-domain convolutive mixture model. A convergence-guaranteed algorithm based on the coordinate descent method is derived for the optimization. Experimental results revealed that the proposed method outperformed the conventional methods in terms of all the source separation criteria in highly reverberant environments.
KW - Blind source separation
KW - blind dereverberation
KW - multichannel audio signal processing
KW - multichannel variational autoencoder (MVAE)
UR - http://www.scopus.com/inward/record.url?scp=85067254083&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067254083&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8683497
DO - 10.1109/ICASSP.2019.8683497
M3 - Conference contribution
AN - SCOPUS:85067254083
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 96
EP - 100
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -