TY - GEN
T1 - Topic-dependent N-gram models based on optimization of context lengths in LDA
AU - Nakamura, Akira
AU - Hayamizu, Satoru
PY - 2010
Y1 - 2010
N2 - This paper describes a method that improves the accuracy of N-gram language models and can be applied to on-line applications. The precision of long-distance language models, including LDA, is influenced by the context length, that is, the length of the history used for prediction. In the proposed method, each of multiple LDA units estimates an optimum context length separately; those predictions are then integrated and N-gram probabilities are calculated. The method directly estimates the optimum context length suitable for prediction. Results show that the method improves topic-dependent N-gram probabilities, particularly for words related to specific topics, yielding higher and more stable performance compared to an existing method.
AB - This paper describes a method that improves the accuracy of N-gram language models and can be applied to on-line applications. The precision of long-distance language models, including LDA, is influenced by the context length, that is, the length of the history used for prediction. In the proposed method, each of multiple LDA units estimates an optimum context length separately; those predictions are then integrated and N-gram probabilities are calculated. The method directly estimates the optimum context length suitable for prediction. Results show that the method improves topic-dependent N-gram probabilities, particularly for words related to specific topics, yielding higher and more stable performance compared to an existing method.
KW - LDA
KW - Language model
KW - Topic model
UR - http://www.scopus.com/inward/record.url?scp=79959822081&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959822081&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:79959822081
T3 - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
SP - 3066
EP - 3069
BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
PB - International Speech Communication Association
ER -