Abstract
Automatic Singing Label Calibration (ASLC) aims to improve the accuracy of coarse singing labels by analyzing the raw audio. However, ASLC models are limited by the difficulty and cost of producing or augmenting real-world songs. To address this problem, we propose a novel approach that supplements limited singing audio with readily available musical instrument audio. Using musical instrument audio directly as data augmentation for singing audio is unreliable because vocal and instrumental sounds differ markedly. We therefore employ transfer learning, which allows relevant knowledge to be transferred from one domain to another. In the pre-training stage, the ASLC model learns to predict accurate labels from musical instrument audio. We then treat the voice as a special musical instrument and fine-tune the pre-trained ASLC model on a singing annotation dataset. Experimental results demonstrate that our transfer learning-based approach outperforms the original ASLC model. By leveraging readily available musical instrument audio, our method improves the labeling accuracy of singing audio.
Original language | English |
---|---|
Pages (from-to) | 707-715 |
Number of pages | 9 |
Journal | IEEJ Transactions on Electrical and Electronic Engineering |
Volume | 19 |
Issue number | 5 |
Publication status | Published - 2024 May |
Keywords
- label calibration
- music information retrieval
- singing annotation
- transfer learning
ASJC Scopus subject areas
- Electrical and Electronic Engineering