TY - GEN
T1 - Fast MVAE
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
AU - Li, Li
AU - Kameoka, Hirokazu
AU - Makino, Shoji
N1 - Funding Information:
This work was supported by JSPS KAKENHI Grant Numbers 17H01763 and 18J20059, and the SECOM Science and Technology Foundation.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - This paper proposes an alternative algorithm for the multi-channel variational autoencoder (MVAE), a recently proposed multichannel source separation approach. While MVAE is notable for its impressive source separation performance, its convergence-guaranteed optimization algorithm, and the fact that it allows us to estimate source-class labels simultaneously with source separation, there are still two major drawbacks, namely, the high computational complexity and the unsatisfactory source classification accuracy. To overcome these drawbacks, the proposed method employs an auxiliary classifier VAE, which is an information-theoretic extension of the conditional VAE, for learning the generative model of the source spectrograms. Furthermore, with the trained auxiliary classifier, we introduce a novel algorithm for the optimization that can both reduce the computational time and improve the source classification performance. We call the proposed method fast MVAE (fMVAE). Experimental evaluations revealed that fMVAE achieved source separation performance comparable to that of MVAE and a source classification accuracy rate of about 80% while reducing computational time by about 93%.
AB - This paper proposes an alternative algorithm for the multi-channel variational autoencoder (MVAE), a recently proposed multichannel source separation approach. While MVAE is notable for its impressive source separation performance, its convergence-guaranteed optimization algorithm, and the fact that it allows us to estimate source-class labels simultaneously with source separation, there are still two major drawbacks, namely, the high computational complexity and the unsatisfactory source classification accuracy. To overcome these drawbacks, the proposed method employs an auxiliary classifier VAE, which is an information-theoretic extension of the conditional VAE, for learning the generative model of the source spectrograms. Furthermore, with the trained auxiliary classifier, we introduce a novel algorithm for the optimization that can both reduce the computational time and improve the source classification performance. We call the proposed method fast MVAE (fMVAE). Experimental evaluations revealed that fMVAE achieved source separation performance comparable to that of MVAE and a source classification accuracy rate of about 80% while reducing computational time by about 93%.
KW - Multichannel source separation
KW - auxiliary classifier
KW - multi-channel variational autoencoder
KW - source classification
UR - http://www.scopus.com/inward/record.url?scp=85069005104&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85069005104&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682623
DO - 10.1109/ICASSP.2019.8682623
M3 - Conference contribution
AN - SCOPUS:85069005104
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 546
EP - 550
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 May 2019 through 17 May 2019
ER -