Localizing bird songs using an open source robot audition system with a microphone array

Reiji Suzuki, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno

    Research output: Contribution to journal › Article › peer-review

    9 Citations (Scopus)


    Auditory scene analysis is critical in observing bio-diversity and understanding social behavior of animals in natural habitats because many animals and birds sing or call and environmental sounds are made. To understand acoustic interactions among songbirds, we need to collect spatiotemporal data for a long period of time during which multiple individuals and species are singing simultaneously. We are developing HARKBird, which is an easily-available and portable system to record, localize, and analyze bird songs. It is composed of a laptop PC with an open source robot audition system HARK (Honda Research Institute Japan Audition for Robots with Kyoto University) and a commercially available low-cost microphone array. HARKBird helps us annotate bird songs and grasp the soundscape around the microphone array by providing the direction of arrival (DOA) of each localized source and its separated sound automatically. In this paper, we briefly introduce our system and show an example analysis of a track recorded at the experimental forest of Nagoya University, in central Japan. We demonstrate that HARKBird can extract birdsongs successfully by combining multiple localization results with appropriate parameter settings that took account of ecological properties of environment around a microphone array and species-specific properties of bird songs.

    Original language: English
    Pages (from-to): 2626-2630
    Number of pages: 5
    Journal: Unknown Journal
    Publication status: Published - 2016

    ASJC Scopus subject areas

    • Language and Linguistics
    • Human-Computer Interaction
    • Signal Processing
    • Software
    • Modelling and Simulation

