TY - GEN
T1 - A GMM sound source model for blind speech separation in under-determined conditions
AU - Hirasawa, Yasuharu
AU - Yasuraoka, Naoki
AU - Takahashi, Toru
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2012
Y1 - 2012
N2 - This paper focuses on blind speech separation in under-determined conditions, that is, when there are more sound sources than microphones. We introduce a sound source model based on the Gaussian mixture model (GMM) to represent a speech signal in the time-frequency domain, and derive rules for updating the model parameters using the auxiliary function method. Our GMM sound source model consists of two kinds of Gaussians: sharp ones representing harmonic parts and smooth ones representing nonharmonic parts. Experimental results reveal that our method outperforms the method based on non-negative matrix factorization (NMF) by 0.7 dB in the signal-to-distortion ratio (SDR) and by 1.7 dB in the signal-to-interference ratio (SIR). This means that our method effectively removes interference coming from other talkers.
AB - This paper focuses on blind speech separation in under-determined conditions, that is, when there are more sound sources than microphones. We introduce a sound source model based on the Gaussian mixture model (GMM) to represent a speech signal in the time-frequency domain, and derive rules for updating the model parameters using the auxiliary function method. Our GMM sound source model consists of two kinds of Gaussians: sharp ones representing harmonic parts and smooth ones representing nonharmonic parts. Experimental results reveal that our method outperforms the method based on non-negative matrix factorization (NMF) by 0.7 dB in the signal-to-distortion ratio (SDR) and by 1.7 dB in the signal-to-interference ratio (SIR). This means that our method effectively removes interference coming from other talkers.
KW - Auxiliary function method
KW - Blind speech separation
KW - GMM sound source model
KW - Under-determined condition
UR - http://www.scopus.com/inward/record.url?scp=84863115686&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863115686&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-28551-6_55
DO - 10.1007/978-3-642-28551-6_55
M3 - Conference contribution
AN - SCOPUS:84863115686
SN - 9783642285509
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 446
EP - 453
BT - Latent Variable Analysis and Signal Separation - 10th International Conference, LVA/ICA 2012, Proceedings
T2 - 10th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2012
Y2 - 12 March 2012 through 15 March 2012
ER -