TY - GEN
T1 - Wikipedia revision graph extraction based on n-gram cover
AU - Wu, Jianmin
AU - Iwaihara, Mizuho
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2012
Y1 - 2012
N2 - During the past decade, mass collaboration systems have emerged and thrived on the World Wide Web, generating large amounts of user content. As one such system, Wikipedia allows users to add and edit articles in its encyclopedic knowledge base, and a vast number of revisions have been contributed. Wikipedia maintains a linear, timestamped record of the edit history of each article, which holds valuable information on how the article has evolved. However, meaningful features of revision evolution, such as branching and reverts, are implicit and need to be reconstructed. In addition, the existence of merges from multiple ancestors indicates that the edit history should be modeled as a directed acyclic graph. To address these issues, we propose a revision graph extraction method based on n-gram cover that effectively finds branching and reverts. We evaluate the accuracy of our method by comparing its output with manually constructed revision graphs.
AB - During the past decade, mass collaboration systems have emerged and thrived on the World Wide Web, generating large amounts of user content. As one such system, Wikipedia allows users to add and edit articles in its encyclopedic knowledge base, and a vast number of revisions have been contributed. Wikipedia maintains a linear, timestamped record of the edit history of each article, which holds valuable information on how the article has evolved. However, meaningful features of revision evolution, such as branching and reverts, are implicit and need to be reconstructed. In addition, the existence of merges from multiple ancestors indicates that the edit history should be modeled as a directed acyclic graph. To address these issues, we propose a revision graph extraction method based on n-gram cover that effectively finds branching and reverts. We evaluate the accuracy of our method by comparing its output with manually constructed revision graphs.
KW - Mass collaboration
KW - Wikipedia revision graph
UR - http://www.scopus.com/inward/record.url?scp=84865646189&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84865646189&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33050-6_4
DO - 10.1007/978-3-642-33050-6_4
M3 - Conference contribution
AN - SCOPUS:84865646189
SN - 9783642330490
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 29
EP - 38
BT - Web-Age Information Management - WAIM 2012 International Workshops
T2 - Int. Workshops on Web-Age Information Management, WAIM 2012: 1st Int. Workshop on GDMM 2012, 2nd Int. Wireless Sensor Networks Workshop, IWSN 2012, 1st Int. Workshop on MDSP 2012, 3rd Int. Workshop on USDM 2012, 4th Int. Workshop on XMLDM 2012
Y2 - 18 August 2012 through 20 August 2012
ER -