The effect of score standardisation on topic set size design

Tetsuya Sakai*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)


Given a topic-by-run score matrix from past data, topic set size design methods can help test collection builders determine the number of topics to create for a new test collection from a statistical viewpoint. In this study, we apply a recently proposed score standardisation method called std-AB to score matrices before applying topic set size design, and demonstrate its advantages. For topic set size design, std-AB suppresses score variances and thereby enables test collection builders to consider realistic choices of topic set sizes, and to handle unnormalised measures in the same way as normalised measures. In addition, even discrete measures that clearly violate normality assumptions look more continuous after applying std-AB, which may make them more suitable for statistically motivated topic set size design. Our experiments cover a variety of tasks and evaluation measures from NTCIR-12.
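
As an illustration only, the standardisation step described above might be sketched as follows. The sketch assumes that std-AB converts each topic's scores to z-scores across runs and then applies a linear map A*z + B clipped to [0, 1]. The constants A = 0.15 and B = 0.5, the use of the sample standard deviation, and the function name `std_ab` are assumptions made for this example; none of them is stated in this record.

```python
from statistics import mean, stdev


def std_ab(matrix, a=0.15, b=0.5):
    """Standardise a topic-by-run score matrix (hypothetical helper).

    matrix: list of rows, one row per topic, one score per run.
    a, b:   assumed linear-transform constants (not given in the abstract).
    """
    standardised = []
    for row in matrix:
        mu = mean(row)
        sigma = stdev(row)  # assumption: sample standard deviation over runs
        if sigma == 0:
            # Every run scored the same on this topic: map all scores to b.
            standardised.append([b] * len(row))
            continue
        # z-score per run, linear map, then clip into [0, 1].
        standardised.append(
            [min(1.0, max(0.0, a * (x - mu) / sigma + b)) for x in row]
        )
    return standardised


# Made-up scores: two topics on very different, unnormalised scales.
raw = [
    [0.0, 2.0, 4.0],
    [10.0, 30.0, 50.0],
]
scores = std_ab(raw)
```

Under these assumptions both topics end up on the same bounded scale, which is the sense in which an unnormalised measure can be handled like a normalised one.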

Original language: English
Title of host publication: Information Retrieval Technology - 12th Asia Information Retrieval Societies Conference, AIRS 2016, Proceedings
Editors: Yi Chang, Ji-Rong Wen, Zhicheng Dou, Xin Zhao, Shaoping Ma, Yiqun Liu, Min Zhang
Publisher: Springer Verlag
Number of pages: 13
ISBN (Print): 9783319480503
Publication status: Published - 2016
Event: 12th Asia Information Retrieval Societies Conference, AIRS 2016 - Beijing, China
Duration: 2016 Nov 30 to 2016 Dec 2

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9994 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 12th Asia Information Retrieval Societies Conference, AIRS 2016

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

