TY - JOUR
T1 - The effect of corpus size on case frame acquisition for predicate-argument structure analysis
AU - Sasano, Ryohei
AU - Kawahara, Daisuke
AU - Kurohashi, Sadao
PY - 2010/6
Y1 - 2010/6
N2 - This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.
AB - This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.
KW - Case frame
KW - Corpus size
KW - Predicate-argument structure analysis
UR - http://www.scopus.com/inward/record.url?scp=77952984545&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952984545&partnerID=8YFLogxK
U2 - 10.1587/transinf.E93.D.1361
DO - 10.1587/transinf.E93.D.1361
M3 - Article
AN - SCOPUS:77952984545
SN - 0916-8532
VL - E93-D
SP - 1361
EP - 1368
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
IS - 6
ER -