Suggesting specific segments as link targets in Wikipedia

Renzhi Wang*, Mizuho Iwaihara

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Wikipedia is the largest online encyclopedia, and its articles form a rich semantic knowledge resource. A link within Wikipedia indicates that the texts at its origin and destination are topically related. Existing link detection methods focus on article titles, because most links in Wikipedia point to whole articles. However, a number of links point to specific segments of an article, because the whole article is too general for readers to grasp the intention of the link. We propose a method that automatically predicts whether a link target is a specific segment and identifies which segment is most relevant. The method combines Latent Dirichlet Allocation (LDA) and Maximum Likelihood Estimation (MLE) to represent every segment as a vector, computes the similarity of each segment pair, and then uses variance, standard deviation, and other statistical features to make the prediction. In evaluations on Wikipedia articles, our method outperforms existing methods.
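The similarity and statistics steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each text is already represented as a vector (for example an LDA topic distribution), that similarity is cosine similarity, and that the statistics are taken over the similarity scores between the link origin and each candidate segment. The function names and the `std_threshold` decision rule are hypothetical stand-ins for whatever features and classifier the paper actually uses.

```python
import math
from statistics import pstdev, pvariance


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_features(origin_vec, segment_vecs):
    """Score every segment against the link origin and summarize the scores."""
    if not segment_vecs:
        raise ValueError("at least one segment vector is required")
    sims = [cosine(origin_vec, seg) for seg in segment_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return {
        "similarities": sims,
        "best_segment": best,          # index of the most similar segment
        "max": sims[best],
        "variance": pvariance(sims),   # population variance of the scores
        "std": pstdev(sims),           # population standard deviation
    }


def suggest_segment(origin_vec, segment_vecs, std_threshold=0.1):
    """Hypothetical rule: suggest a segment only when the scores are spread out.

    A large spread means one segment stands out from the rest; a small spread
    means no segment is clearly better, so the whole article is the target.
    Returns the segment index, or None for "link to the whole article".
    """
    feats = similarity_features(origin_vec, segment_vecs)
    if feats["std"] >= std_threshold:
        return feats["best_segment"]
    return None
```

With an origin vector of `[0.8, 0.1, 0.1]` and three segments whose mass sits on different topics, `suggest_segment` picks the segment that shares the origin's dominant topic; when all segments have the same vector, the spread is zero and it returns `None`.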

Original language: English
Title of host publication: Digital Libraries
Subtitle of host publication: Knowledge, Information, and Data in an Open Access Society - 18th International Conference on Asia-Pacific Digital Libraries, ICADL 2016, Proceedings
Editors: Atsuyuki Morishima, Andreas Rauber, Chern li Liew
Publisher: Springer Verlag
Pages: 394-405
Number of pages: 12
ISBN (Print): 9783319493039
Publication status: Published - 2016 Jan 1
Event: 18th International Conference on Asia-Pacific Digital Libraries, ICADL 2016 - Tsukuba, Japan
Duration: 2016 Dec 7 to 2016 Dec 9

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10075 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 18th International Conference on Asia-Pacific Digital Libraries, ICADL 2016
Country/Territory: Japan
City: Tsukuba
Period: 16/12/7 to 16/12/9

Keywords

  • LDA
  • Link suggestion
  • Text mining
  • Wikipedia

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
