TY - GEN
T1 - Sparse Bayesian Hierarchical Mixture of Experts and Variational Inference
AU - Iikubo, Yuji
AU - Horii, Shunsuke
AU - Matsushima, Toshiyasu
PY - 2019/3/8
Y1 - 2019/3/8
N2 - The hierarchical mixture of experts (HME) is a tree-structured probabilistic model for regression and classification. The HME has considerable expressive capability; however, parameter estimation tends to overfit due to the complexity of the model. To avoid this problem, regularization techniques are widely used. In particular, it is known that a sparse solution can be obtained by L1 regularization. From a Bayesian point of view, regularization is equivalent to assuming that the parameters follow prior distributions and finding the maximum a posteriori probability estimator. It is known that L1 regularization is equivalent to assuming Laplace distributions as prior distributions. However, it is difficult to compute the posterior distribution if Laplace distributions are assumed. In this paper, we assume that the parameters of the HME follow hierarchical prior distributions that are equivalent to Laplace distributions, in order to promote sparse solutions. We propose a Bayesian estimation algorithm based on the variational method. Finally, the proposed algorithm is evaluated by computer simulations.
AB - The hierarchical mixture of experts (HME) is a tree-structured probabilistic model for regression and classification. The HME has considerable expressive capability; however, parameter estimation tends to overfit due to the complexity of the model. To avoid this problem, regularization techniques are widely used. In particular, it is known that a sparse solution can be obtained by L1 regularization. From a Bayesian point of view, regularization is equivalent to assuming that the parameters follow prior distributions and finding the maximum a posteriori probability estimator. It is known that L1 regularization is equivalent to assuming Laplace distributions as prior distributions. However, it is difficult to compute the posterior distribution if Laplace distributions are assumed. In this paper, we assume that the parameters of the HME follow hierarchical prior distributions that are equivalent to Laplace distributions, in order to promote sparse solutions. We propose a Bayesian estimation algorithm based on the variational method. Finally, the proposed algorithm is evaluated by computer simulations.
UR - http://www.scopus.com/inward/record.url?scp=85063870175&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063870175&partnerID=8YFLogxK
U2 - 10.23919/ISITA.2018.8664333
DO - 10.23919/ISITA.2018.8664333
M3 - Conference contribution
AN - SCOPUS:85063870175
T3 - Proceedings of 2018 International Symposium on Information Theory and Its Applications, ISITA 2018
SP - 60
EP - 64
BT - Proceedings of 2018 International Symposium on Information Theory and Its Applications, ISITA 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th International Symposium on Information Theory and Its Applications, ISITA 2018
Y2 - 28 October 2018 through 31 October 2018
ER -