TY - JOUR
T1 - Application of variational Bayesian estimation and clustering to acoustic model adaptation
AU - Watanabe, Shinji
AU - Minami, Yasuhiro
AU - Nakamura, Atsushi
AU - Ueda, Naonori
PY - 2003
Y1 - 2003
N2 - In this paper, we apply Variational Bayesian Estimation and Clustering for speech recognition (VBEC) to acoustic model adaptation. VBEC can estimate parameter posteriors even when a model includes hidden variables, by using the Variational Bayesian approach. In addition, VBEC can select an appropriate model structure when clustering triphone states, according to the amount of available adaptation data. Unlike a conventional Bayesian method such as Maximum A Posteriori (MAP), VBEC is useful even with small amounts of data, because model structure selection increases the amount of data per Gaussian and suppresses over-training. We conduct an off-line supervised adaptation experiment on isolated word recognition and show the advantage of the proposed method over the conventional method, especially when dealing with small amounts of adaptation data.
UR - http://www.scopus.com/inward/record.url?scp=0141591454&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0141591454&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:0141591454
SN - 1520-6149
VL - 1
SP - 568
EP - 571
JO - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
JF - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
T2 - 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing
Y2 - 6 April 2003 through 10 April 2003
ER -