Predicting listener back-channels for human-agent interaction using neuro-dynamical model

Shotaro Sano*, Shun Nishide, Hiroshi G. Okuno, Tetsuya Ogata

*Corresponding author for this work

Research output: Conference contribution

1 Citation (Scopus)

Abstract

The goal of our work is to create natural verbal interaction between humans and speech dialogue agents. In this paper, we focus on having speech dialogue agents generate back-channels the same way humans do. To do so, the system needs to predict the appropriate timing of a back-channel on the basis of the human's speech. As the prediction model, we use a neuro-dynamical system called a multiple timescale recurrent neural network (MTRNN). The model is trained on an actual corpus of a poster session from the IMADE project, using the presenter's prosodic and visual information as features. Using the model, we conducted back-channel timing prediction experiments. The results showed that our system could predict back-channel timing about 0.5 seconds before the back-channel response was generated. Compared with the actual back-channel timing in the corpus, the system achieved a recall of 37.1%, a precision of 31.7%, and an F-measure of 34.2%. These results suggest that the model can effectively predict back-channel timing and generate back-channel responses.
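The reported F-measure can be checked against the reported recall and precision. The sketch below is illustrative only: it assumes the balanced F-measure (the harmonic mean of precision and recall, often written F1), which the abstract does not state explicitly, and the function and variable names are hypothetical rather than taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values quoted in the abstract (assumed here to be fractions of 1).
reported_precision = 0.317
reported_recall = 0.371

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.1%}")
```

Under that assumption the computed value rounds to the 34.2% given in the abstract, so the three reported figures are mutually consistent.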

Original language: English
Title of host publication: 2011 IEEE/SICE International Symposium on System Integration, SII 2011
Pages: 18-23
Number of pages: 6
DOI
Publication status: Published - 2011
Externally published: Yes
Event: 2011 IEEE/SICE International Symposium on System Integration, SII 2011 - Kyoto, Japan
Duration: 20 Dec 2011 to 22 Dec 2011

Publication series

Name: 2011 IEEE/SICE International Symposium on System Integration, SII 2011

Conference

Conference: 2011 IEEE/SICE International Symposium on System Integration, SII 2011
Country/Territory: Japan
City: Kyoto
Period: 2011/12/20 to 2011/12/22

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Control and Systems Engineering
