Supervised determined source separation with multichannel variational autoencoder

Hirokazu Kameoka, Li Li, Shota Inoue, Shoji Makino

Research output: Contribution to journal › Letter › peer-review

55 Citations (Scopus)

Abstract

This letter proposes a multichannel source separation technique, the multichannel variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model and estimate the power spectrograms of the sources in a mixture. By training the CVAE using the spectrograms of training examples with source-class labels, we can use the trained decoder distribution as a universal generative model capable of generating spectrograms conditioned on a specified class index. By treating the latent space variables and the class index as the unknown parameters of this generative model, we can develop a convergence-guaranteed algorithm for supervised determined source separation that consists of iteratively estimating the power spectrograms of the underlying sources, as well as the separation matrices. In experimental evaluations, our MVAE produced better separation performance than a baseline method.
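As a rough illustration of the kind of objective an alternating scheme like this could decrease, the sketch below evaluates a negative log-likelihood for a single frequency bin of a two-channel mixture. It assumes the local complex Gaussian source model commonly used in determined separation, which the abstract does not state explicitly. The function name, the nested-list data layout, and the toy values are hypothetical; in the MVAE method the variances `V` would be produced by the trained CVAE decoder rather than supplied by hand.

```python
import math

def neg_log_likelihood(W, X, V):
    """Negative log-likelihood (up to a constant) for one frequency bin.

    W: 2x2 separation matrix as nested lists of complex numbers
       (hypothetical layout; row j extracts source j).
    X: list of N mixture frames, each a list of 2 complex observations.
    V: V[j][n] > 0 is the modeled power of source j at frame n
       (assumed here; a CVAE decoder would supply it in the MVAE method).
    """
    n_frames = len(X)
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    # The determinant term penalizes degenerate separation matrices.
    total = -2.0 * n_frames * math.log(abs(det))
    for n, x in enumerate(X):
        for j in range(2):
            y = W[j][0] * x[0] + W[j][1] * x[1]  # separated signal y_j(n)
            total += math.log(V[j][n]) + abs(y) ** 2 / V[j][n]
    return total

# Toy example: identity separation matrix, one frame, unit variances.
W = [[1 + 0j, 0j], [0j, 1 + 0j]]
X = [[1 + 0j, 0j]]
V = [[1.0], [1.0]]
nll = neg_log_likelihood(W, X, V)
```

An iterative algorithm of the kind described would alternate between lowering this value with respect to `V` (through the generative model's parameters) and with respect to `W`, so that the objective never increases from one iteration to the next.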

Original language: English
Pages (from-to): 1891-1914
Number of pages: 24
Journal: Neural Computation
Volume: 31
Issue number: 9
DOIs
Publication status: Published - 2019 Sept 1
Externally published: Yes

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience
