TY - GEN
T1 - Metrics, statistics, tests
AU - Sakai, Tetsuya
PY - 2014/1/1
Y1 - 2014/1/1
N2 - This lecture is intended to serve as an introduction to Information Retrieval (IR) effectiveness metrics and their usage in IR experiments using test collections. Evaluation metrics are important because they are inexpensive tools for monitoring technological advances. This lecture covers a wide variety of IR metrics (except for those designed for XML retrieval, as there is a separate lecture dedicated to this topic) and discusses some methods for evaluating evaluation metrics. It also briefly covers computer-based statistical significance testing. The takeaways for IR experimenters are: (1) It is important to understand the properties of IR metrics and choose or design appropriate ones for the task at hand; (2) Computer-based statistical significance tests are simple and useful, although statistical significance does not necessarily imply practical significance, and statistical insignificance does not necessarily imply practical insignificance; and (3) Several methods exist for discussing which metrics are "good," although none of them is perfect.
AB - This lecture is intended to serve as an introduction to Information Retrieval (IR) effectiveness metrics and their usage in IR experiments using test collections. Evaluation metrics are important because they are inexpensive tools for monitoring technological advances. This lecture covers a wide variety of IR metrics (except for those designed for XML retrieval, as there is a separate lecture dedicated to this topic) and discusses some methods for evaluating evaluation metrics. It also briefly covers computer-based statistical significance testing. The takeaways for IR experimenters are: (1) It is important to understand the properties of IR metrics and choose or design appropriate ones for the task at hand; (2) Computer-based statistical significance tests are simple and useful, although statistical significance does not necessarily imply practical significance, and statistical insignificance does not necessarily imply practical insignificance; and (3) Several methods exist for discussing which metrics are "good," although none of them is perfect.
UR - http://www.scopus.com/inward/record.url?scp=84901306933&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901306933&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-54798-0_6
DO - 10.1007/978-3-642-54798-0_6
M3 - Conference contribution
AN - SCOPUS:84901306933
SN - 9783642547973
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 116
EP - 163
BT - Bridging Between Information Retrieval and Databases - PROMISE Winter School 2013, Revised Tutorial Lectures
PB - Springer Verlag
T2 - 2013 PROMISE Winter School: Bridging Between Information Retrieval and Databases
Y2 - 4 February 2013 through 8 February 2013
ER -