TY - JOUR
T1 - A study on difference of codelengths between codes based on MDL principle and Bayes codes for given prior distributions
AU - Gotoh, Masayuki
AU - Matsushima, Toshiyasu
AU - Hirasawa, Shigeichi
PY - 2001/1/1
Y1 - 2001/1/1
N2 - The principle of the Minimum Description Length (MDL) proposed by J. Rissanen provides a type of structure for the model estimation based on probabilistic model selection allowing minimization of the codelength. On the other hand, the use of Bayes codes makes it possible to find a coding function from a mix of probabilistic models without specifying any concrete model. It has been pointed out that codes based on the MDL principle (MDL codes) are closely related to Bayes theory because in the definition of the description length of the probabilistic model, an unknown prior distribution is assumed. In this paper, we apply asymptotic analysis to the codelength difference between the MDL codes and Bayes codes, including cases of different prior distributions. The results of the analysis clearly show that in the case of discrete model families, codes having a high prior distribution in true models (that is, the models for which an advantageous prior distribution is assumed) are favorable, but in the case of parametric model families, Bayes codes have shorter codelength than the MDL codes even in the cases of advantageous prior distribution assumed for the MDL codes.
AB - The principle of the Minimum Description Length (MDL) proposed by J. Rissanen provides a type of structure for the model estimation based on probabilistic model selection allowing minimization of the codelength. On the other hand, the use of Bayes codes makes it possible to find a coding function from a mix of probabilistic models without specifying any concrete model. It has been pointed out that codes based on the MDL principle (MDL codes) are closely related to Bayes theory because in the definition of the description length of the probabilistic model, an unknown prior distribution is assumed. In this paper, we apply asymptotic analysis to the codelength difference between the MDL codes and Bayes codes, including cases of different prior distributions. The results of the analysis clearly show that in the case of discrete model families, codes having a high prior distribution in true models (that is, the models for which an advantageous prior distribution is assumed) are favorable, but in the case of parametric model families, Bayes codes have shorter codelength than the MDL codes even in the cases of advantageous prior distribution assumed for the MDL codes.
KW - Bayes code
KW - Information source coding
KW - MDL principle
KW - Prior distribution
UR - http://www.scopus.com/inward/record.url?scp=0035127714&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035127714&partnerID=8YFLogxK
U2 - 10.1002/1520-6440(200104)84:4<30::AID-ECJC4>3.0.CO;2-C
DO - 10.1002/1520-6440(200104)84:4<30::AID-ECJC4>3.0.CO;2-C
M3 - Article
AN - SCOPUS:0035127714
SN - 1042-0967
VL - 84
SP - 30
EP - 40
JO - Electronics and Communications in Japan, Part III: Fundamental Electronic Science (English translation of Denshi Tsushin Gakkai Ronbunshi)
JF - Electronics and Communications in Japan, Part III: Fundamental Electronic Science (English translation of Denshi Tsushin Gakkai Ronbunshi)
IS - 4
ER -