Finding co-occurring topics in wikipedia article segments

Renzhi Wang*, Jianmin Wu, Mizuho Iwaihara

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Wikipedia is the largest online encyclopedia, and its articles form a rich semantic knowledge resource. When the same topic appears in different articles, those articles are related through that topic. Finding such co-occurring topics is useful for improving the accuracy of querying and clustering, and for contrasting related articles. Existing work on topic alignment and topic relevance detection is based on term occurrence. In our research, we discuss incorporating the latent topics present in article segments, using Latent Dirichlet Allocation (LDA), to detect topic relevance. We also study how segment proximities, arising from segment ordering and hyperlinks, can be incorporated into topic detection and alignment. Experimental data show that our method can find and distinguish three types of co-occurrence.

Original language: English
Title of host publication: The Emergence of Digital Libraries - Research and Practices - 16th International Conference on Asia-Pacific Digital Libraries, ICADL 2014, Proceedings
Editors: Adam Jatowt, Edie Rasmussen, Kulthida Tuamsuk
Publisher: Springer Verlag
Pages: 252-259
Number of pages: 8
ISBN (Electronic): 9783319128221
Publication status: Published - 2014 Jan 1
Event: 16th International Conference on Asia-Pacific Digital Libraries, ICADL 2014 - Chiang Mai, Thailand
Duration: 2014 Nov 5 to 2014 Nov 7

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8839
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Asia-Pacific Digital Libraries, ICADL 2014
Country/Territory: Thailand
City: Chiang Mai
Period: 14/11/5 to 14/11/7

Keywords

  • LDA
  • Link
  • MLE
  • Wikipedia

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
