On-line sound event localization and detection for real-time recognition of surrounding environment

Kento Nagatomo*, Masahiro Yasuda, Kohei Yatabe, Shoichiro Saito, Yasuhiro Oikawa

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Sound Event Localization and Detection (SELD) is the task of simultaneously identifying sound events and their locations. Existing methods perform SELD in an off-line setting using deep neural networks (DNNs), including bi-directional recurrent neural networks (BiRNNs). Although their effectiveness has been shown in the literature, they cannot be directly applied to real-time applications that require on-line execution of SELD, i.e., the input signals must be processed successively with small latency. In this paper, we propose an on-line extension of off-line SELD systems and discuss the essential latency of an on-line SELD system. The relationship between system latency and SELD accuracy was investigated experimentally. The experimental results confirmed that the on-line extension of the SELD system maintains or improves localization performance, whereas event detection performance degrades at low latency.

Original language: English
Article number: 108961
Journal: Applied Acoustics
Volume: 199
Publication status: Published - 2022 Oct

Keywords

  • Direction of arrival (DOA)
  • Short-time Fourier transform (STFT)
  • Sound event detection (SED)
  • Sound event localization (SEL)
  • Sound event localization and detection (SELD)
  • System latency
  • Window function

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
