TY - GEN
T1 - Multi-view bootstrapping for relation extraction by exploring web features and linguistic features
AU - Yan, Yulan
AU - Li, Haibo
AU - Matsuo, Yutaka
AU - Ishizuka, Mitsuru
PY - 2010
Y1 - 2010
N2 - Binary semantic relation extraction from Wikipedia is particularly useful for various NLP and Web applications. Currently, frequent pattern mining-based methods and syntactic analysis-based methods are the two leading types of methods for the semantic relation extraction task. With a novel view on integrating syntactic analysis of Wikipedia text with redundancy information from the Web, we propose a multi-view learning approach for bootstrapping relationships between entities that exploits the complementarity between the Web view and the linguistic view. On the one hand, from the linguistic view, linguistic features are generated from linguistic parsing of Wikipedia texts by abstracting away from different surface realizations of semantic relations. On the other hand, Web features are extracted from the Web corpus to provide frequency information for relation extraction. Experimental evaluation on a relational dataset demonstrates that linguistic analysis of Wikipedia texts and Web collective information reveal different aspects of the nature of entity-related semantic relationships. It also shows that our multi-view learning method considerably boosts performance compared to learning with only one view of features, with the weaknesses of one view complemented by the strengths of the other.
AB - Binary semantic relation extraction from Wikipedia is particularly useful for various NLP and Web applications. Currently, frequent pattern mining-based methods and syntactic analysis-based methods are the two leading types of methods for the semantic relation extraction task. With a novel view on integrating syntactic analysis of Wikipedia text with redundancy information from the Web, we propose a multi-view learning approach for bootstrapping relationships between entities that exploits the complementarity between the Web view and the linguistic view. On the one hand, from the linguistic view, linguistic features are generated from linguistic parsing of Wikipedia texts by abstracting away from different surface realizations of semantic relations. On the other hand, Web features are extracted from the Web corpus to provide frequency information for relation extraction. Experimental evaluation on a relational dataset demonstrates that linguistic analysis of Wikipedia texts and Web collective information reveal different aspects of the nature of entity-related semantic relationships. It also shows that our multi-view learning method considerably boosts performance compared to learning with only one view of features, with the weaknesses of one view complemented by the strengths of the other.
UR - http://www.scopus.com/inward/record.url?scp=78049292163&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78049292163&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-12116-6_45
DO - 10.1007/978-3-642-12116-6_45
M3 - Conference contribution
AN - SCOPUS:78049292163
SN - 3642121152
SN - 9783642121159
VL - 6008 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 525
EP - 536
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
T2 - 11th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2010
Y2 - 21 March 2010 through 27 March 2010
ER -