TY - GEN
T1 - Revision graph extraction in Wikipedia based on supergram decomposition
AU - Wu, Jianmin
AU - Iwaihara, Mizuho
PY - 2013/11/28
Y1 - 2013/11/28
N2 - As one of the popular social media that many people have turned to in recent years, the collaborative encyclopedia Wikipedia provides information in a more "Neutral Point of View" way than others. Toward this core principle, considerable effort has been put into collaborative contribution and editing. The trajectories of how such collaboration unfolds through revisions are valuable for group dynamics and social media research, which suggests that we should precisely extract the underlying derivation relationships among revisions from the chronologically sorted revision history. In this paper, we propose a revision graph extraction method based on supergram decomposition in a document collection of near-duplicates. The plain text of each revision is measured by its frequency distribution of supergrams, where a supergram is a variable-length token sequence that remains unchanged across revisions. We show that this method performs the task more effectively than existing methods. Categories and Subject Descriptors: K.4.3 [Computers and Society]: Organizational Impacts - Computer-supported collaborative work. General Terms: Algorithms, Experimentation.
AB - As one of the popular social media that many people have turned to in recent years, the collaborative encyclopedia Wikipedia provides information in a more "Neutral Point of View" way than others. Toward this core principle, considerable effort has been put into collaborative contribution and editing. The trajectories of how such collaboration unfolds through revisions are valuable for group dynamics and social media research, which suggests that we should precisely extract the underlying derivation relationships among revisions from the chronologically sorted revision history. In this paper, we propose a revision graph extraction method based on supergram decomposition in a document collection of near-duplicates. The plain text of each revision is measured by its frequency distribution of supergrams, where a supergram is a variable-length token sequence that remains unchanged across revisions. We show that this method performs the task more effectively than existing methods. Categories and Subject Descriptors: K.4.3 [Computers and Society]: Organizational Impacts - Computer-supported collaborative work. General Terms: Algorithms, Experimentation.
KW - Collaboration
KW - Revision history
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=84888148254&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84888148254&partnerID=8YFLogxK
U2 - 10.1145/2491055.2491065
DO - 10.1145/2491055.2491065
M3 - Conference contribution
AN - SCOPUS:84888148254
SN - 9781450318525
T3 - Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013
BT - Proceedings of the 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013
T2 - 9th International Symposium on Open Collaboration, WikiSym + OpenSym 2013
Y2 - 5 August 2013 through 7 August 2013
ER -