TY - GEN
T1 - Design and collection of acoustic sound data for hands-free speech recognition and sound scene understanding
AU - Nakamura, Satoshi
AU - Hiyane, Kazuo
AU - Asano, Futoshi
AU - Kaneda, Yutaka
AU - Yamada, Takeshi
AU - Nishiura, Takanobu
AU - Kobayashi, Tetsunori
AU - Ise, Shiro
AU - Saruwatari, Hiroshi
N1 - Publisher Copyright: © 2002 IEEE.
PY - 2002
Y1 - 2002
AB - Sound data for open evaluation are necessary for studies such as sound source localization, sound retrieval, sound recognition, and hands-free speech recognition in real acoustic environments. This paper reports on our project for acoustic data collection. Real environments contain many kinds of sound scenes, each specified by its sound sources and room acoustics. The number of combinations of sound sources, source positions, and rooms in real acoustic environments is huge. We therefore assumed that sound in these environments can be simulated by convolving isolated sound sources with impulse responses. As isolated sound sources, a hundred kinds of environmental sounds and speech sounds were collected. Impulse responses were collected in various acoustic environments. Additionally, we collected sounds from a moving source. This paper describes the progress of our sound scene database collection project and its application to environmental sound recognition and hands-free speech recognition.
UR - http://www.scopus.com/inward/record.url?scp=84872976579&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84872976579&partnerID=8YFLogxK
U2 - 10.1109/ICME.2002.1035537
DO - 10.1109/ICME.2002.1035537
M3 - Conference contribution
AN - SCOPUS:84872976579
T3 - Proceedings - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002
SP - 161
EP - 164
BT - Proceedings - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002
Y2 - 26 August 2002 through 29 August 2002
ER -