SogouQ: The First Large-Scale Test Collection with Click Streams Used in a Shared-Task Evaluation

Ruihua Song*, Min Zhang, Cheng Luo, Tetsuya Sakai, Yiqun Liu, Zhicheng Dou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Search logs are a valuable resource for information retrieval research. This chapter introduces SogouQ, a real Chinese query log dataset released by Sogou Corporation in 2010 for the NTCIR-9 Intent task. SogouQ contains more than 30 million clicks collected in 2008. It is the first large-scale query log used in a shared-task evaluation (i.e., the NTCIR tasks). SogouQ has since been adopted in a number of follow-up evaluation tasks, including NTCIR-10 Intent-2, NTCIR-11 IMine, and NTCIR-12 IMine-2, as well as in several Chinese domestic tasks. Moreover, SogouQ has had a broader impact on other research areas, such as natural language processing and social science. It has been acquired by more than 200 institutions.

Original language: English
Title of host publication: Information Retrieval Series
Publisher: Springer Nature
Pages: 143-150
Number of pages: 8
Publication status: Published - 2021

Publication series

Name: Information Retrieval Series
Volume: 43
ISSN (Print): 1871-7500

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences
