TY - JOUR
T1 - Recognition of intentions of users' short responses for conversational news delivery system
AU - Takatsu, Hiroaki
AU - Yokoyama, Katsuya
AU - Matsuyama, Yoichi
AU - Honda, Hiroshi
AU - Fujie, Shinya
AU - Kobayashi, Tetsunori
N1 - Publisher Copyright: Copyright © 2019 ISCA
PY - 2019
Y1 - 2019
AB - In human-human conversations, listeners often convey intentions to their speakers through feedback comprising reflexive short responses. The speakers then recognize these intentions and dynamically change their conversational plans to transmit information more efficiently. For the design of spoken dialogue systems that deliver a massive amount of information, such as news, it is essential to accurately capture users' intentions from reflexive short responses in order to efficiently select or eliminate the information to be transmitted depending on the user's needs. However, such responses are normally too short for the users' actual intentions to be recognized from their prosodic and linguistic features alone. In this paper, we propose an intention-recognition model for users' short responses that accounts for the system's previous utterances as the context of the conversation, in addition to the prosodic and linguistic features of the user's utterances. To achieve this, we defined types of short-response intentions in terms of effective information transmission and created a new dataset by annotating interaction data collected using our spoken dialogue system. Our experimental results demonstrate that the classification accuracy can be improved by using the linguistic features of the system's previous utterances, encoded by Bidirectional Encoder Representations from Transformers (BERT), as the conversational context.
KW - Intention recognition
KW - Neural networks
KW - Spoken dialogue system
UR - http://www.scopus.com/inward/record.url?scp=85074685615&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074685615&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2019-2121
DO - 10.21437/Interspeech.2019-2121
M3 - Conference article
AN - SCOPUS:85074685615
SN - 2308-457X
VL - 2019-September
SP - 1193
EP - 1197
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019
Y2 - 15 September 2019 through 19 September 2019
ER -