Transfer Learning Using Musical Instrument Audio for Improving Automatic Singing Label Calibration

Xiao Fu, Xijian Rui, Hangyu Deng, Jinglu Hu*

*Corresponding author for this work

Research output: Article (peer-reviewed)

Abstract

Automatic Singing Label Calibration (ASLC) aims to enhance the labeling accuracy of coarse singing labels through the analysis of raw audio. However, the ASLC model faces limitations due to the challenges and costs associated with generating or augmenting real-world songs. To address this problem, we propose a novel approach to strengthen limited singing audio using easily available musical instrument audio. Directly using the musical instrument audio as a data augmentation for the singing audio is unreliable due to the distinct differences between vocal and instrumental sounds. Therefore, we employ transfer learning, which allows relevant knowledge to be transferred from one domain to another. In the pre-training stage, the ASLC model learns to predict the accurate labels from the musical instrument audio. We then consider the vocal as a special musical instrument and fine-tune the pretrained ASLC model using a singing annotation data set. Experimental results demonstrate that our transfer learning-based approach outperforms the original ASLC model. By leveraging the readily available musical instrument audio, our method achieves improved performance in enhancing the labeling accuracy of singing audio.
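The two-stage procedure described in the abstract (pre-train on plentiful instrument data, then fine-tune on scarce singing data) can be illustrated with a toy sketch. This is not the authors' ASLC model: a three-parameter linear regressor stands in for it, the data are synthetic, and every name here (`make_data`, `train`, `instrument`, `singing_train`, the 0.3 domain shift) is a hypothetical chosen only for illustration.

```python
import random

def make_data(n, bias, rng):
    """Synthetic (features, target) pairs with target = 2*x1 - x2 + bias."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 2 * x[0] - x[1] + bias))
    return data

def predict(w, x):
    """Linear model: w[0]*x1 + w[1]*x2 + w[2] (w[2] is the bias term)."""
    return w[0] * x[0] + w[1] * x[1] + w[2]

def train(w, data, epochs, lr=0.02):
    """Per-sample gradient descent on squared error, starting from weights w."""
    w = list(w)  # copy so the caller's starting weights are not modified
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            w[2] -= lr * err
    return w

def mse(w, data):
    """Mean squared error of the model on a data set."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)
# Assumption: the "instrument" (source) domain is large, the "singing"
# (target) domain is small and differs by a constant offset of 0.3.
instrument = make_data(200, 0.0, rng)
singing_train = make_data(8, 0.3, rng)
singing_test = make_data(200, 0.3, rng)

# Stage 1: pre-train on the source domain.
pretrained = train([0.0, 0.0, 0.0], instrument, epochs=20)
# Stage 2: fine-tune the pre-trained weights on the small target set.
fine_tuned = train(pretrained, singing_train, epochs=5)
# Baseline: same target data and training budget, but no pre-training.
scratch = train([0.0, 0.0, 0.0], singing_train, epochs=5)

print("fine-tuned MSE:  ", mse(fine_tuned, singing_test))
print("from-scratch MSE:", mse(scratch, singing_test))
```

In this toy setting the fine-tuned model starts close to the target solution, so a short training budget on eight samples is enough to adapt it, whereas the from-scratch baseline has to learn everything from those same eight samples.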

Original language: English
Pages (from-to): 707-715
Number of pages: 9
Journal: IEEJ Transactions on Electrical and Electronic Engineering
Volume: 19
Issue: 5
Publication status: Published - May 2024

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
