Linear multichannel blind source separation based on time-frequency mask obtained by harmonic/percussive sound separation

Soichiro Oyabu, Daichi Kitamura, Kohei Yatabe

Research output: Contribution to journal > Conference article > peer-review

3 Citations (Scopus)


Determined blind source separation (BSS) extracts source signals by linear multichannel filtering. Its performance depends on the accuracy of the source model, and existing BSS methods have therefore adopted a variety of source models. Recently, a new determined BSS algorithm that incorporates a time-frequency mask has been proposed. It enables very flexible source modeling because the model is implicitly defined by a mask-generating function. Building on this framework, in this paper we propose a unification of determined BSS and harmonic/percussive sound separation (HPSS). HPSS is an important preprocessing step for musical applications. By incorporating HPSS, both harmonic and percussive instruments can be accurately modeled for determined BSS. The resulting algorithm estimates the demixing filter using the information obtained by an HPSS method. We also propose a stabilization method that is essential for the proposed algorithm. Our experiments showed that the proposed method outperformed both HPSS and determined BSS methods, including independent low-rank matrix analysis.
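To illustrate what an HPSS-based mask-generating function could look like, the following is a minimal sketch using median-filtering HPSS, one widely used HPSS technique. The abstract does not say which HPSS method the authors use, so the choice of median filtering, the names `median_filter_1d` and `hpss_masks`, the kernel size, and the soft-mask formula are all assumptions made for illustration and should not be read as the paper's algorithm.

```python
import numpy as np


def median_filter_1d(x, size, axis):
    """Median-filter a 2-D array along one axis (odd `size`, edge padding)."""
    pad = size // 2
    pad_width = [(0, 0)] * x.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(x, pad_width, mode="edge")
    # Sliding windows along `axis`; the window samples land on a new last axis.
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss_masks(magnitude, kernel=17, eps=1e-12):
    """Soft harmonic/percussive masks for a (freq, time) magnitude spectrogram.

    Hypothetical helper: harmonic sounds are smooth along time and
    percussive sounds are smooth along frequency, so each is enhanced
    by a median filter along the corresponding axis.
    """
    harmonic = median_filter_1d(magnitude, kernel, axis=1)    # along time
    percussive = median_filter_1d(magnitude, kernel, axis=0)  # along frequency
    h2, p2 = harmonic ** 2, percussive ** 2
    total = h2 + p2 + eps  # eps avoids division by zero in silent bins
    return h2 / total, p2 / total


# Toy spectrogram: one sustained tone (horizontal line) and one
# click (vertical line). Values are synthetic, not real audio.
spec = np.zeros((64, 64))
spec[10, :] = 1.0   # harmonic component at frequency bin 10
spec[:, 30] = 1.0   # percussive component at time frame 30

harmonic_mask, percussive_mask = hpss_masks(spec)
```

In a mask-based determined BSS framework such as the one the abstract describes, masks like these would inform the estimation of the demixing filter rather than being multiplied onto the mixture directly; how that update is carried out, and how the mask is stabilized, is specified in the paper and is not reproduced here.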

Original language: English
Pages (from-to): 201-205
Number of pages: 5
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: 2021 Jun 6 - 2021 Jun 11


  • Determined blind source separation (BSS)
  • Harmonic/percussive sound separation (HPSS)
  • Mask stabilization
  • Plug-and-play scheme
  • Time-frequency masking

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

