F0 estimation method for singing voice in polyphonic audio signal based on statistical vocal model and Viterbi search

Hiromasa Fujihara*, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Conference contribution

30 Citations (Scopus)

Abstract

This paper describes a method for estimating the F0s of the vocal part in polyphonic audio signals. Because the melody is sung by a singer in many musical pieces, estimating the F0s of the vocal part is useful for many applications. Building on an existing multiple-F0 estimation method, we evaluate the vocal probability of the harmonic structure of each F0 candidate. To calculate this probability, we extract and resynthesize the harmonic structure using a sinusoidal model and then extract feature vectors. We then evaluate the vocal probability using vocal and non-vocal Gaussian mixture models (GMMs). Finally, we track F0 trajectories with a Viterbi search based on these probabilities. Experimental results show that our method improves estimation accuracy from 78.1% to 84.3%, a 28.3% relative reduction in misestimation (the error rate falls from 21.9% to 15.7%).
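The final tracking step in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the transition model, the features, or the GMM configuration. The sketch assumes each frame already comes with a list of F0 candidates and a log-domain vocal score per candidate (for example, a vocal versus non-vocal GMM log-likelihood difference). The function name `viterbi_f0`, the `jump_weight` parameter, and the pitch-jump penalty measured in cents are all hypothetical choices made for illustration.

```python
import math


def cents(f_from, f_to):
    """Signed pitch interval from f_from to f_to, in cents."""
    return 1200.0 * math.log2(f_to / f_from)


def viterbi_f0(candidates, jump_weight=0.01):
    """Pick one F0 per frame so that the total path score is maximal.

    candidates: list of frames; each frame is a non-empty list of
        (f0_hz, vocal_score) pairs. vocal_score is ASSUMED to be a
        log-domain score where higher means "more vocal-like".
    jump_weight: HYPOTHETICAL penalty per cent of pitch jump between
        consecutive frames (not taken from the paper).
    """
    # best[j]: best cumulative score of any path ending at candidate j.
    best = [score for _, score in candidates[0]]
    backptrs = []
    for t in range(1, len(candidates)):
        prev = candidates[t - 1]
        new_best, ptr = [], []
        for f0, score in candidates[t]:
            # Score of arriving from each previous candidate, after the
            # pitch-jump penalty.
            options = [
                best[i] - jump_weight * abs(cents(prev[i][0], f0))
                for i in range(len(prev))
            ]
            i_best = max(range(len(options)), key=options.__getitem__)
            new_best.append(options[i_best] + score)
            ptr.append(i_best)
        best = new_best
        backptrs.append(ptr)

    # Backtrack from the best final candidate.
    j = max(range(len(best)), key=best.__getitem__)
    path = [j]
    for ptr in reversed(backptrs):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[t][j][0] for t, j in enumerate(path)]


# Toy input: in the middle frame, an octave-error candidate (110 Hz)
# has the highest per-frame score.
frames = [
    [(220.0, -1.0), (440.0, -3.0)],
    [(222.0, -1.2), (110.0, -0.5)],
    [(224.0, -1.0), (448.0, -2.5)],
]
print(viterbi_f0(frames))  # [220.0, 222.0, 224.0]
```

In this toy input, picking the best-scoring candidate frame by frame would select 110 Hz in the middle frame. The path search rejects it because the octave jump costs more than the small score advantage, which is the general motivation for tracking trajectories instead of deciding each frame independently.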

Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings
Pages: V253-V256
Publication status: Published - Dec 1 2006
Externally published: Yes
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 - Toulouse, France
Duration: May 14 2006 to May 19 2006

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 5
ISSN (Print): 1520-6149

Conference

Conference: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006
Country/Territory: France
City: Toulouse
Period: 06/5/14 to 06/5/19

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
