TY - JOUR
T1 - Dictation of Multiparty Conversation Considering Speaker Individuality and Turn Taking
AU - Murai, Noriyuki
AU - Kobayashi, Tetsunori
PY - 2003
Y1 - 2003/11/30
N2 - This paper discusses an algorithm that recognizes multiparty speech with complex turn taking. In recognition of the conversation of multiple speakers, it is necessary to know not only what is spoken, as in the conventional system, but also who spoke up to what point. The purpose of this paper is to find a method to solve this problem. The representation of the likelihood of turn taking is included in the language model in the continuous speech recognition system, and the speech properties of each speaker are represented by a statistical model. Using this approach, two algorithms are proposed that estimate simultaneously and in parallel the speaker and the speech content. Recognition experiments using conversation in TV sports news show that the proposed method can correct a maximum of 29.5% of the errors in the recognition of speech content and 93.0% of the errors in recognition of the speaker.
KW - GMM
KW - MLLR
KW - Multiparty conversation
KW - Speaker individuality
KW - Statistical turn taking model
UR - http://www.scopus.com/inward/record.url?scp=0242364146&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0242364146&partnerID=8YFLogxK
U2 - 10.1002/scj.1223
DO - 10.1002/scj.1223
M3 - Article
AN - SCOPUS:0242364146
SN - 0882-1666
VL - 34
SP - 103
EP - 111
JO - Systems and Computers in Japan
JF - Systems and Computers in Japan
IS - 13
ER -