TY - GEN
T1 - Determined audio source separation with multichannel star generative adversarial network
AU - Li, Li
AU - Kameoka, Hirokazu
AU - Makino, Shoji
N1 - Funding Information: This work was supported by JSPS KAKENHI 18J20059 and 19H04131, and JST CREST JPMJCR19A3. Publisher Copyright: © 2020 IEEE.
PY - 2020/9
Y1 - 2020/9
N2 - This paper proposes a multichannel source separation approach that uses a star generative adversarial network (StarGAN) to model the power spectrograms of sources. Various studies have shown that a precise source model contributes significantly to audio source separation performance, which motivates the development of better source models. We explore the potential of StarGAN for modeling source spectrograms and evaluate the StarGAN source model in determined multichannel source separation by incorporating it into a frequency-domain independent component analysis (ICA) framework. The experimental results show that the proposed StarGAN-based method outperformed conventional methods that use non-negative matrix factorization (NMF) or a variational autoencoder (VAE) for source spectrogram modeling.
KW - Deep generative model
KW - Determined source separation
KW - Multichannel audio signal processing
KW - Spectrogram modeling
KW - Star generative adversarial network (StarGAN)
UR - http://www.scopus.com/inward/record.url?scp=85096485442&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096485442&partnerID=8YFLogxK
U2 - 10.1109/MLSP49062.2020.9231555
DO - 10.1109/MLSP49062.2020.9231555
M3 - Conference contribution
AN - SCOPUS:85096485442
T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP
BT - Proceedings of the 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing, MLSP 2020
PB - IEEE Computer Society
T2 - 30th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2020
Y2 - 21 September 2020 through 24 September 2020
ER -