TY - GEN
T1 - Extracting the author of web pages
AU - Kato, Yoshikiyo
AU - Kawahara, Daisuke
AU - Inui, Kentaro
AU - Kurohashi, Sadao
AU - Shibata, Tomohide
PY - 2008
Y1 - 2008
N2 - In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts author name candidates from a Web page based on linguistic features and ranks the candidates based on local features such as distance from the main content. The evaluation shows that we can achieve more than 75% precision when evaluated with candidates ranked within the top five.
AB - In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts author name candidates from a Web page based on linguistic features and ranks the candidates based on local features such as distance from the main content. The evaluation shows that we can achieve more than 75% precision when evaluated with candidates ranked within the top five.
KW - Algorithms
KW - Experimentation
UR - http://www.scopus.com/inward/record.url?scp=70349231362&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349231362&partnerID=8YFLogxK
U2 - 10.1145/1458527.1458537
DO - 10.1145/1458527.1458537
M3 - Conference contribution
AN - SCOPUS:70349231362
SN - 9781605582597
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 35
EP - 41
BT - Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08
T2 - 2nd ACM Workshop on Information Credibility on the Web, WICOW'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Y2 - 26 October 2008 through 30 October 2008
ER -