Relational duality: Unsupervised extraction of semantic relations between entities on the web

Danushka Tarupathi Bollegala, Yutaka Matsuo, Mitsuru Ishizuka

Research output: Conference contribution

73 citations (Scopus)


Extracting semantic relations among entities is an important first step in various tasks in Web mining and natural language processing such as information extraction, relation detection, and social network mining. A relation can be expressed extensionally by stating all the instances of that relation or intensionally by defining all the paraphrases of that relation. For example, consider the ACQUISITION relation between two companies. An extensional definition of ACQUISITION contains all pairs of companies in which one company is acquired by another (e.g., (YouTube, Google) or (Powerset, Microsoft)). On the other hand, we can intensionally define ACQUISITION as the relation described by lexical patterns such as "X is acquired by Y" or "Y purchased X", where X and Y denote two companies. We use this dual representation of semantic relations to propose a novel sequential co-clustering algorithm that can extract numerous relations efficiently from unlabeled data. We provide an efficient heuristic to find the parameters of the proposed co-clustering algorithm. Using the clusters produced by the algorithm, we train an L1 regularized logistic regression model to identify the representative patterns that describe the relation expressed by each cluster. We evaluate the proposed method in three different tasks: measuring relational similarity between entity pairs, open information extraction (Open IE), and classifying relations in a social network system. Experiments conducted using a benchmark dataset show that the proposed method improves existing relational similarity measures. Moreover, the proposed method significantly outperforms the current state-of-the-art Open IE systems in terms of both precision and recall. The proposed method correctly classifies 53 relation types in an online social network containing 470,671 nodes and 35,652,475 edges, thereby demonstrating its efficacy in real-world relation detection tasks.
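
The dual representation described in the abstract can be illustrated with a small sketch. This is not the authors' sequential co-clustering algorithm: it is a simplified stand-in that applies a greedy one-pass clustering separately to patterns (described by the entity pairs they occur with) and to entity pairs (described by the patterns they occur with). The counts, the CEO-style pairs ("Alice", "Acme") and ("Bob", "Globex"), the function names, and the similarity threshold of 0.5 are all assumptions made for illustration; the paper's parameter heuristic and the L1 regularized logistic regression step are not reproduced here.

```python
from collections import defaultdict
from math import sqrt

# Toy (entity pair, lexical pattern, count) observations.
# The acquisition pairs come from the abstract; everything else,
# including all counts, is invented for illustration.
OBSERVATIONS = [
    (("YouTube", "Google"), "X is acquired by Y", 3),
    (("YouTube", "Google"), "Y purchased X", 2),
    (("Powerset", "Microsoft"), "X is acquired by Y", 1),
    (("Powerset", "Microsoft"), "Y purchased X", 4),
    (("Alice", "Acme"), "X is the CEO of Y", 5),
    (("Alice", "Acme"), "Y chief executive X", 2),
    (("Bob", "Globex"), "X is the CEO of Y", 2),
    (("Bob", "Globex"), "Y chief executive X", 1),
]


def build_views(observations):
    """Return both views of the same co-occurrence data."""
    by_pattern = defaultdict(dict)  # pattern -> {entity pair: count}
    by_pair = defaultdict(dict)     # entity pair -> {pattern: count}
    for pair, pattern, count in observations:
        by_pattern[pattern][pair] = by_pattern[pattern].get(pair, 0) + count
        by_pair[pair][pattern] = by_pair[pair].get(pattern, 0) + count
    return dict(by_pattern), dict(by_pair)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def sequential_cluster(vectors, threshold):
    """Greedy one-pass clustering (a simplification, not the paper's method).

    Items are visited from most to least frequent. Each item joins the most
    similar existing cluster if the similarity reaches the threshold,
    otherwise it starts a new cluster.
    """
    clusters = []
    order = sorted(vectors, key=lambda k: (-sum(vectors[k].values()), str(k)))
    for key in order:
        vec = vectors[key]
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"members": [key], "centroid": dict(vec)})
        else:
            best["members"].append(key)
            for feature, weight in vec.items():
                best["centroid"][feature] = best["centroid"].get(feature, 0) + weight
    return [cluster["members"] for cluster in clusters]


by_pattern, by_pair = build_views(OBSERVATIONS)
pattern_clusters = sequential_cluster(by_pattern, threshold=0.5)  # intensional side
pair_clusters = sequential_cluster(by_pair, threshold=0.5)        # extensional side
print(pattern_clusters)
print(pair_clusters)
```

On this toy input, the two acquisition patterns fall into one cluster and the two acquisition pairs into another, which mirrors the intensional and extensional definitions of ACQUISITION given above. A real system would need the coupled clustering and parameter selection that the paper describes.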

Title of host publication: Proceedings of the 19th International Conference on World Wide Web, WWW '10
Publication status: Published - 2010
Event: 19th International World Wide Web Conference, WWW2010 - Raleigh, NC
Duration: 26 April 2010 to 30 April 2010



ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

