TY - GEN
T1 - Multi-modal service operation estimation using DNN-based acoustic bag-of-features
AU - Tamura, Satoshi
AU - Uno, Takuya
AU - Takehara, Masanori
AU - Hayamizu, Satoru
AU - Kurata, Takeshi
N1 - Publisher Copyright: © 2015 EURASIP.
PY - 2015/12/22
Y1 - 2015/12/22
N2 - In service engineering, it is important to estimate what a worker did and when, because this information provides crucial evidence for improving service quality and working environments. For Service Operation Estimation (SOE), acoustic information is a key modality; environmental or background sounds in particular contain effective cues. This paper focuses on two aspects: (1) extracting powerful and robust acoustic features using stacked denoising autoencoder and bag-of-features techniques, and (2) investigating a multi-modal SOE scheme that combines the audio features with other sensor data as well as non-sensor information. We conducted evaluation experiments using multi-modal data recorded in a restaurant. SOE performance improved in comparison with conventional acoustic features, and the effectiveness of our multi-modal SOE scheme was also confirmed.
AB - In service engineering, it is important to estimate what a worker did and when, because this information provides crucial evidence for improving service quality and working environments. For Service Operation Estimation (SOE), acoustic information is a key modality; environmental or background sounds in particular contain effective cues. This paper focuses on two aspects: (1) extracting powerful and robust acoustic features using stacked denoising autoencoder and bag-of-features techniques, and (2) investigating a multi-modal SOE scheme that combines the audio features with other sensor data as well as non-sensor information. We conducted evaluation experiments using multi-modal data recorded in a restaurant. SOE performance improved in comparison with conventional acoustic features, and the effectiveness of our multi-modal SOE scheme was also confirmed.
KW - bag of features
KW - environmental sounds
KW - multimodal signal processing
KW - service operation estimation
KW - stacked denoising autoencoder
UR - http://www.scopus.com/inward/record.url?scp=84963949107&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963949107&partnerID=8YFLogxK
U2 - 10.1109/EUSIPCO.2015.7362793
DO - 10.1109/EUSIPCO.2015.7362793
M3 - Conference contribution
AN - SCOPUS:84963949107
T3 - 2015 23rd European Signal Processing Conference, EUSIPCO 2015
SP - 2291
EP - 2295
BT - 2015 23rd European Signal Processing Conference, EUSIPCO 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd European Signal Processing Conference, EUSIPCO 2015
Y2 - 31 August 2015 through 4 September 2015
ER -