TY - GEN
T1 - Grounding in social media
T2 - NAACL 2022 Student Research Workshop, SRW 2022, at 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
AU - Choudhary, Ritvik
AU - Kawahara, Daisuke
N1 - Funding Information:
This work was supported by a joint research grant from LINE Corporation.
Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Building open-domain dialogue systems capable of rich human-like conversational ability is one of the fundamental challenges in language generation. However, even with recent advancements in the field, existing open-domain generative models fail to capture and utilize external knowledge, leading to repetitive or generic responses to unseen utterances. Current work on knowledge-grounded dialogue generation primarily focuses on persona incorporation or searching a fact-based structured knowledge source such as Wikipedia. Our method takes a broader and simpler approach, which aims to improve the raw conversation ability of the system by mimicking the human response behavior through casual interactions found on social media. Utilizing a joint retriever-generator setup, the model queries a large set of filtered comment data from Reddit to act as additional context for the seq2seq generator. Automatic and human evaluations on open-domain dialogue datasets demonstrate the effectiveness of our approach.
AB - Building open-domain dialogue systems capable of rich human-like conversational ability is one of the fundamental challenges in language generation. However, even with recent advancements in the field, existing open-domain generative models fail to capture and utilize external knowledge, leading to repetitive or generic responses to unseen utterances. Current work on knowledge-grounded dialogue generation primarily focuses on persona incorporation or searching a fact-based structured knowledge source such as Wikipedia. Our method takes a broader and simpler approach, which aims to improve the raw conversation ability of the system by mimicking the human response behavior through casual interactions found on social media. Utilizing a joint retriever-generator setup, the model queries a large set of filtered comment data from Reddit to act as additional context for the seq2seq generator. Automatic and human evaluations on open-domain dialogue datasets demonstrate the effectiveness of our approach.
UR - http://www.scopus.com/inward/record.url?scp=85137592161&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137592161&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85137592161
T3 - NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Student Research Workshop
SP - 9
EP - 15
BT - NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
Y2 - 10 July 2022 through 15 July 2022
ER -