TY - GEN
T1 - Reliability verification of search engines' hit counts
T2 - 10th International Conference on Web Engineering, ICWE 2010
AU - Funahashi, Takuya
AU - Yamana, Hayato
N1 - Funding Information:
The authors are grateful for the financial support of the Grant-in-Aid for Scientific Research from the Ministry of Education, Science, Sports and Culture (No. 21300038). We would like to thank our anonymous reviewers, who provided helpful comments for refining this paper.
PY - 2010
Y1 - 2010
N2 - In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the "Search" button more than once or clicks the "Next" button on the search results page, or when a user queries the same term on separate days. In this paper, we analyze the characteristics of hit count transition by gathering various types of hit counts over two months using 10,000 queries. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.
AB - In this paper, we investigate the trustworthiness of search engines' hit counts, the numbers returned as search result counts. Since many studies adopt search engines' hit counts to estimate the popularity of input queries, the reliability of hit counts is indispensable for achieving trustworthy studies. However, hit counts are unreliable because they change when a user clicks the "Search" button more than once or clicks the "Next" button on the search results page, or when a user queries the same term on separate days. In this paper, we analyze the characteristics of hit count transition by gathering various types of hit counts over two months using 10,000 queries. The results of our study show that the hit counts with the largest search offset just before search engines adjust their hit counts are the most reliable. Moreover, hit counts are the most reliable when they are consistent over approximately a week.
KW - hit count
KW - information retrieval
KW - reliability
KW - search engine
KW - trustworthiness
UR - http://www.scopus.com/inward/record.url?scp=78649832936&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78649832936&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-16985-4_11
DO - 10.1007/978-3-642-16985-4_11
M3 - Conference contribution
AN - SCOPUS:78649832936
SN - 3642169848
SN - 9783642169847
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 114
EP - 125
BT - Current Trends in Web Engineering - 10th International Conference on Web Engineering, ICWE 2010 Workshops, Revised Selected Papers
Y2 - 5 July 2010 through 9 July 2010
ER -