TY - GEN
T1 - CENSREC-2-AV
T2 - 2012 15th International Conference on Speech Database and Assessments, Oriental COCOSDA 2012
AU - Ukai, Naoya
AU - Kawasaki, Takuya
AU - Tamura, Satoshi
AU - Hayamizu, Satoru
AU - Miyajima, Chiyomi
AU - Kitaoka, Norihide
AU - Takeda, Kazuya
PY - 2012
Y1 - 2012
N2 - In this paper, we introduce a bimodal speech recognition corpus for real environments. In recent years, speech recognition technology has been used in noisy conditions. Therefore, it has become necessary to achieve higher recognition accuracy in real environments. As one of the solutions, bimodal speech recognition using audio and non-audio information is being studied. However, there are few databases that can be used to evaluate bimodal speech recognition in real environments. In this paper, we introduce CENSREC-2-AV, which we have been working to build as a new bimodal speech recognition corpus. CENSREC-2-AV is one of the databases of the CENSREC project; we provided a similar corpus, CENSREC-1-AV, as a database for bimodal speech recognition under additive noise. These corpora contain speech data and lip images. Researchers can use CENSREC-2-AV to evaluate, in real environments, a bimodal speech recognition method built using CENSREC-1-AV, which consists of clean data.
AB - In this paper, we introduce a bimodal speech recognition corpus for real environments. In recent years, speech recognition technology has been used in noisy conditions. Therefore, it has become necessary to achieve higher recognition accuracy in real environments. As one of the solutions, bimodal speech recognition using audio and non-audio information is being studied. However, there are few databases that can be used to evaluate bimodal speech recognition in real environments. In this paper, we introduce CENSREC-2-AV, which we have been working to build as a new bimodal speech recognition corpus. CENSREC-2-AV is one of the databases of the CENSREC project; we provided a similar corpus, CENSREC-1-AV, as a database for bimodal speech recognition under additive noise. These corpora contain speech data and lip images. Researchers can use CENSREC-2-AV to evaluate, in real environments, a bimodal speech recognition method built using CENSREC-1-AV, which consists of clean data.
KW - CENSREC
KW - audio-visual speech corpus
KW - bimodal speech recognition
KW - real environment
UR - http://www.scopus.com/inward/record.url?scp=84874244954&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84874244954&partnerID=8YFLogxK
U2 - 10.1109/ICSDA.2012.6422476
DO - 10.1109/ICSDA.2012.6422476
M3 - Conference contribution
AN - SCOPUS:84874244954
SN - 9781467328104
T3 - Proceedings of the 2012 International Conference on Speech Database and Assessments, Oriental COCOSDA 2012
SP - 88
EP - 91
BT - Proceedings of the 2012 International Conference on Speech Database and Assessments, Oriental COCOSDA 2012
Y2 - 9 December 2012 through 12 December 2012
ER -