TY - JOUR
T1 - Automatic determination of acoustic model topology using variational Bayesian estimation and clustering
AU - Watanabe, Shinji
AU - Sako, Atsushi
AU - Nakamura, Atsushi
PY - 2004/9/28
Y1 - 2004/9/28
N2 - We describe the automatic determination of an acoustic model for speech recognition, which is very complicated and includes latent variables, using VBEC: Variational Bayesian Estimation and Clustering for speech recognition. We propose an efficient Gaussian Mixture Model (GMM) based phonetic decision tree construction within the VBEC framework. The proposed method features a novel approach to reduce the unrealistically large number of computations needed for iterative calculations in the GMM-based decision tree method to a practical level by assuming that each Gaussian per state has the same occupancy and is represented by the same posterior distribution for the covariance parameter. The experimental results confirmed that VBEC automatically provided an optimum model topology with the highest performance level.
AB - We describe the automatic determination of an acoustic model for speech recognition, which is very complicated and includes latent variables, using VBEC: Variational Bayesian Estimation and Clustering for speech recognition. We propose an efficient Gaussian Mixture Model (GMM) based phonetic decision tree construction within the VBEC framework. The proposed method features a novel approach to reduce the unrealistically large number of computations needed for iterative calculations in the GMM-based decision tree method to a practical level by assuming that each Gaussian per state has the same occupancy and is represented by the same posterior distribution for the covariance parameter. The experimental results confirmed that VBEC automatically provided an optimum model topology with the highest performance level.
UR - http://www.scopus.com/inward/record.url?scp=4544387676&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4544387676&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:4544387676
SN - 0736-7791
VL - 1
SP - I813-I816
JO - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
JF - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
T2 - Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing
Y2 - 17 May 2004 through 21 May 2004
ER -