TY - GEN

T1 - Classifying community QA questions that contain an image

AU - Tamaki, Kenta

AU - Togashi, Riku

AU - Kato, Sosuke

AU - Fujita, Sumio

AU - Maeda, Hideyuki

AU - Sakai, Tetsuya

N1 - Publisher Copyright: © 2018 Association for Computing Machinery.

PY - 2018/9/10

Y1 - 2018/9/10

N2 - We consider the problem of automatically assigning a category to a given question posted to a Community Question Answering (CQA) site, where the question contains not only text but also an image. For example, CQA users may post a photograph of a dress and ask the community "Is this appropriate for a wedding?" where the appropriate category for this question might be "Manners, Ceremonial occasions." We tackle this problem using Convolutional Neural Networks with a DualNet architecture for combining the image and text representations. Our experiments with real data from Yahoo Chiebukuro and crowdsourced gold-standard categories show that the DualNet approach outperforms a text-only baseline (p = .0000), a sum-and-product baseline (p = .0000), Multimodal Compact Bilinear pooling (p = .0000), and a combination of sum-and-product and MCB (p = .0000), where the p-values are based on a randomised Tukey Honestly Significant Difference test with B = 5000 trials.

KW - community question answering

KW - convolutional neural networks

KW - question categorisation

UR - http://www.scopus.com/inward/record.url?scp=85063515836&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063515836&partnerID=8YFLogxK

U2 - 10.1145/3234944.3234948

DO - 10.1145/3234944.3234948

M3 - Conference contribution

AN - SCOPUS:85063515836

T3 - ICTIR 2018 - Proceedings of the 2018 ACM SIGIR International Conference on the Theory of Information Retrieval

SP - 219

EP - 222

BT - ICTIR 2018 - Proceedings of the 2018 ACM SIGIR International Conference on the Theory of Information Retrieval

PB - Association for Computing Machinery, Inc

T2 - 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2018

Y2 - 14 September 2018 through 17 September 2018

ER -