Abstract
We exploit the barge-in rate of individual users to predict automatic speech recognition (ASR) errors. A barge-in occurs when a user starts speaking during a system prompt, and it can be detected even when ASR results are unreliable. Features that do not depend on ASR results can therefore serve as cues for handling situations in which user utterances cannot be recognized successfully. Since individual users in our system can be identified by their phone numbers, we accumulate how often each user barges in and use this rate as a user profile for determining whether a current barge-in utterance should be accepted. We furthermore set a window that reflects the temporal transition of the user's behavior as they become accustomed to the system. Experimental results show that setting the window improves the accuracy of predicting whether an utterance should be accepted. The experiments also clarify the minimum window width needed to improve accuracy.
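As a rough illustration of the windowed per-user barge-in rate described in the abstract, the following minimal sketch keeps only the most recent utterances of each user and reports the fraction that were barge-ins. The class name `BargeInProfile`, its methods, the window size, and the caller identifier are all hypothetical and chosen for illustration; the abstract does not specify how the rate is computed or how it is combined with other features to decide whether an utterance is accepted.

```python
from collections import defaultdict, deque


class BargeInProfile:
    """Hypothetical per-user barge-in rate over the last `window` utterances."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be a positive number of utterances")
        self.window = window
        # One bounded history per user; old utterances drop out automatically.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, user_id, barged_in):
        """Record whether the user's latest utterance was a barge-in."""
        self._history[user_id].append(bool(barged_in))

    def rate(self, user_id):
        """Return the barge-in rate within the window, or None if no history."""
        history = self._history.get(user_id)
        if not history:
            return None
        return sum(history) / len(history)


# Illustrative usage with a made-up caller identifier.
profile = BargeInProfile(window=3)
for flag in [True, True, False, False]:
    profile.observe("caller-001", flag)
# Only the last three utterances (True, False, False) remain in the window.
print(profile.rate("caller-001"))
```

A bounded history is one simple way to let the rate follow a user's changing behavior; other choices, such as a time-based window or exponential decay, would be equally plausible readings of the abstract.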
Original language | English |
---|---|
Title of host publication | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Pages | 183-186 |
Number of pages | 4 |
Publication status | Published - 2008 |
Externally published | Yes |
Event | INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Brisbane, QLD, Australia. Duration: 2008 Sept 22 → 2008 Sept 26 |
Other
Other | INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association |
---|---|
Country/Territory | Australia |
City | Brisbane, QLD |
Period | 2008 Sept 22 → 2008 Sept 26 |
Keywords
- Barge-in
- Spoken dialogue system
- User modeling
ASJC Scopus subject areas
- Human-Computer Interaction
- Signal Processing
- Software
- Sensory Systems