TY - JOUR
T1 - Using short dependency relations from auto-parsed data for Chinese dependency parsing
AU - Chen, Wenliang
AU - Kawahara, Daisuke
AU - Uchimoto, Kiyotaka
AU - Zhang, Yujie
AU - Isahara, Hitoshi
PY - 2009/8/1
Y1 - 2009/8/1
N2 - Dependency parsing has attracted a surge of interest lately for applications such as machine translation and question answering. Currently, several supervised learning methods can be used to train high-performance dependency parsers if sufficient labeled data are available. However, current statistical dependency parsers perform poorly on words separated by long distances. To address this problem, this article presents an effective dependency parsing approach that incorporates short dependency information from unlabeled data. The unlabeled data are automatically parsed using a deterministic dependency parser, which exhibits relatively high performance on short dependencies between words. We then train another parser that uses the information on short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52%, an absolute 1.24% improvement over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves parsing performance for words separated by longer distances.
AB - Dependency parsing has attracted a surge of interest lately for applications such as machine translation and question answering. Currently, several supervised learning methods can be used to train high-performance dependency parsers if sufficient labeled data are available. However, current statistical dependency parsers perform poorly on words separated by long distances. To address this problem, this article presents an effective dependency parsing approach that incorporates short dependency information from unlabeled data. The unlabeled data are automatically parsed using a deterministic dependency parser, which exhibits relatively high performance on short dependencies between words. We then train another parser that uses the information on short dependency relations extracted from the output of the first parser. The proposed approach achieves an unlabeled attachment score of 86.52%, an absolute 1.24% improvement over the baseline system on the Chinese Treebank data set. The results indicate that the proposed approach improves parsing performance for words separated by longer distances.
KW - Chinese dependency parsing
KW - Semi-supervised learning
KW - Unlabeled data
UR - http://www.scopus.com/inward/record.url?scp=70349084103&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349084103&partnerID=8YFLogxK
U2 - 10.1145/1568292.1568293
DO - 10.1145/1568292.1568293
M3 - Article
AN - SCOPUS:70349084103
SN - 1530-0226
VL - 8
JO - ACM Transactions on Asian Language Information Processing
JF - ACM Transactions on Asian Language Information Processing
IS - 3
M1 - 10
ER -