Unsupervised Answer Retrieval with Data Fusion for Community Question Answering

Sosuke Kato*, Toru Shimizu, Sumio Fujita, Tetsuya Sakai

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)


Community question answering (cQA) systems have benefited from advances in neural information retrieval, but some of these models need annotated documents as supervised data. However, unlike the amount of supervised data available for cQA systems, the amount of user-generated data on cQA sites has grown greatly over time. Thus, focusing on unsupervised models, we tackle the task of retrieving relevant answers for new questions from existing cQA data and propose two frameworks that exploit a Question Retrieval (QR) model for Answer Retrieval (AR). The first framework ranks answers according to the combined scores of QR and AR models, and the second framework ranks answers using the scores of a QR model and best answer flags. In our experiments, we applied the combination of our proposed frameworks and a classical fusion technique to AR models on a Japanese cQA data set containing approximately 9.4M question-answer pairs. When best answer flags in the cQA data cannot be utilized, our combination of AR and QR scores with data fusion outperforms a base AR model on average. When best answer flags can be utilized, the retrieval performance can be improved further. While our results lack statistical significance, we discuss effect sizes as well as the future sample sizes needed to attain sufficient statistical power.
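
The two frameworks described in the abstract can be illustrated with a minimal sketch. The abstract does not name the fusion technique or the score normalization, so CombSUM with min-max normalization is assumed here purely for illustration. The function names, the `answers_of` mapping, and the toy scores are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {id: score} dict to [0, 1]; a constant run maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def comb_sum(*runs):
    """CombSUM (assumed fusion method): sum normalized scores per answer."""
    fused = defaultdict(float)
    for run in runs:
        for answer_id, score in min_max_normalize(run).items():
            fused[answer_id] += score
    # Sort by fused score (descending), then by id to break ties stably.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


def qr_scores_to_answers(qr_scores, answers_of, best_only=False):
    """Propagate each retrieved question's QR score to its answers.

    answers_of: {question_id: [(answer_id, is_best_answer), ...]}
    best_only:  keep only answers flagged as best answers.
    """
    out = {}
    for question_id, score in qr_scores.items():
        for answer_id, is_best in answers_of.get(question_id, []):
            if best_only and not is_best:
                continue
            out[answer_id] = max(score, out.get(answer_id, float("-inf")))
    return out


# Toy data (hypothetical): AR scores per answer, QR scores per past question.
ar_scores = {"a1": 2.0, "a2": 1.0, "a3": 0.5}
qr_scores = {"q1": 0.9, "q2": 0.3}
answers_of = {"q1": [("a2", True), ("a3", False)], "q2": [("a1", True)]}

# Framework 1 (sketch): fuse AR scores with QR scores propagated to answers.
fused_ranking = [a for a, _ in comb_sum(
    ar_scores, qr_scores_to_answers(qr_scores, answers_of))]

# Framework 2 (sketch): rank best answers by their question's QR score.
best = qr_scores_to_answers(qr_scores, answers_of, best_only=True)
best_ranking = sorted(best, key=lambda a: (-best[a], a))

print(fused_ranking)
print(best_ranking)
```

In this sketch an answer inherits the QR score of the past question it belongs to, so a strong question match can promote an answer that the AR model alone ranks lower. How the paper actually propagates and weights the scores is not stated in the abstract.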

Original language: English
Title of host publication: Information Retrieval Technology - 15th Asia Information Retrieval Societies Conference, AIRS 2019, Proceedings
Editors: Fu Lee Wang, Haoran Xie, Wai Lam, Aixin Sun, Lun-Wei Ku, Tianyong Hao, Wei Chen, Tak-Lam Wong, Xiaohui Tao
Number of pages: 12
ISBN (Print): 9783030428341
Publication status: Published - 2020
Event: 15th Asia Information Retrieval Societies Conference, AIRS 2019 - Kowloon, Hong Kong
Duration: 2019 Nov 7 to 2019 Nov 9

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12004 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 15th Asia Information Retrieval Societies Conference, AIRS 2019
Country/Territory: Hong Kong


  • Answer Retrieval
  • Community question answering
  • Data fusion
  • Question Retrieval
  • Unsupervised model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

