TY - JOUR
T1 - Enabling a user to specify an item at any time during system enumeration - Item identification for barge-in-able conversational dialogue systems
AU - Matsuyama, Kyoko
AU - Komatani, Kazunori
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2009/11/26
Y1 - 2009/11/26
N2 - In conversational dialogue systems, users prefer to speak at any time and to use natural expressions. We have developed a semi-blind source separation method based on Independent Component Analysis (ICA), which allows users to barge in over system utterances at any time. We created a novel method that uses timing information derived from barge-in utterances to identify the one item that a user indicates during system enumeration. First, we determine the timing distribution of user utterances containing referential expressions and approximate it with a gamma distribution. Second, we represent both the utterance timing and the automatic speech recognition (ASR) results as probabilities of the desired selection from the system's enumeration. We then integrate these two probabilities to identify the item with the maximum likelihood of selection. Experimental results using 400 utterances indicated that our method outperformed two baseline methods (one using only ASR results and one using only utterance timing) in identification accuracy.
AB - In conversational dialogue systems, users prefer to speak at any time and to use natural expressions. We have developed a semi-blind source separation method based on Independent Component Analysis (ICA), which allows users to barge in over system utterances at any time. We created a novel method that uses timing information derived from barge-in utterances to identify the one item that a user indicates during system enumeration. First, we determine the timing distribution of user utterances containing referential expressions and approximate it with a gamma distribution. Second, we represent both the utterance timing and the automatic speech recognition (ASR) results as probabilities of the desired selection from the system's enumeration. We then integrate these two probabilities to identify the item with the maximum likelihood of selection. Experimental results using 400 utterances indicated that our method outperformed two baseline methods (one using only ASR results and one using only utterance timing) in identification accuracy.
KW - Barge-in
KW - Conversational interaction
KW - Spoken dialogue system
KW - Utterance timing
UR - http://www.scopus.com/inward/record.url?scp=70450162205&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70450162205&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:70450162205
SN - 2308-457X
SP - 252
EP - 255
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
Y2 - 6 September 2009 through 10 September 2009
ER -