TY - JOUR
T1 - On-line sound event localization and detection for real-time recognition of surrounding environment
AU - Nagatomo, Kento
AU - Yasuda, Masahiro
AU - Yatabe, Kohei
AU - Saito, Shoichiro
AU - Oikawa, Yasuhiro
N1 - Publisher Copyright: © 2022 The Authors
PY - 2022/10
Y1 - 2022/10
AB - Sound Event Localization and Detection (SELD) is the task of simultaneously identifying sound events and their locations. Existing methods perform SELD in an off-line setting using deep neural networks (DNNs), including bi-directional recurrent neural networks (BiRNNs). Although their effectiveness has been shown in the literature, they cannot be directly applied to real-time applications, which require on-line execution of SELD, i.e., the input signals must be processed successively with small latency. In this paper, we propose an on-line extension of off-line SELD systems and discuss the essential latency of an on-line SELD system. The relationship between system latency and SELD accuracy was investigated experimentally. The experimental results confirmed that the on-line extension of the SELD system maintains or improves localization performance, while event detection performance degrades at low latency.
KW - Direction of arrival (DOA)
KW - Short-time Fourier transform (STFT)
KW - Sound event detection (SED)
KW - Sound event localization (SEL)
KW - Sound event localization and detection (SELD)
KW - System latency
KW - Window function
UR - http://www.scopus.com/inward/record.url?scp=85137175321&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137175321&partnerID=8YFLogxK
U2 - 10.1016/j.apacoust.2022.108961
DO - 10.1016/j.apacoust.2022.108961
M3 - Article
AN - SCOPUS:85137175321
SN - 0003-682X
VL - 199
JO - Applied Acoustics
JF - Applied Acoustics
M1 - 108961
ER -