A method for transcribing vocal expressions such as vibrato, glissando, and kobushi separately from polyphonic music is described. These expressions appear as fluctuations in the fundamental frequency (F0) contour of the singing voice. Because they strongly reflect the individuality of the singer, they can be used for music search and retrieval and for expressive singing voice synthesis based on singing style. First, the F0 contour of the singing voice is estimated using the Viterbi algorithm, constrained by a corresponding note sequence. Next, the notes are temporally aligned with the F0 sequence. Finally, each expression is identified and parameterized according to designed rules. Experiments demonstrated that the method can transcribe expressions in the singing voice from commercial recordings.
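The first step, Viterbi-based F0 estimation constrained by a note sequence, can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the function name `viterbi_f0`, the per-frame candidate salience matrix, the hard pitch window around the note (`max_dev`), and the linear jump penalty (`jump_penalty`) are all hypothetical and are not taken from the paper.

```python
import numpy as np


def viterbi_f0(salience, cand_cents, note_cents, max_dev=300.0, jump_penalty=0.01):
    """Pick one F0 candidate per frame with Viterbi decoding (illustrative only).

    salience:     (T, K) non-negative score of each pitch candidate per frame
    cand_cents:   (K,) candidate pitches in cents
    note_cents:   (T,) pitch of the note active in each frame, in cents
    max_dev:      assumed constraint: candidates farther than this many cents
                  from the active note are forbidden
    jump_penalty: assumed cost per cent of frame-to-frame pitch change
    """
    salience = np.asarray(salience, dtype=float)
    cand_cents = np.asarray(cand_cents, dtype=float)
    note_cents = np.asarray(note_cents, dtype=float)
    T, K = salience.shape

    # Emission log-scores; candidates outside the note window get -inf.
    emit = np.log(salience + 1e-12)
    allowed = np.abs(cand_cents[None, :] - note_cents[:, None]) <= max_dev
    emit = np.where(allowed, emit, -np.inf)

    # Transition log-scores penalise large pitch jumps between frames.
    trans = -jump_penalty * np.abs(cand_cents[:, None] - cand_cents[None, :])

    score = emit[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        total = score[:, None] + trans          # rows: previous, cols: current
        back[t] = np.argmax(total, axis=0)      # best predecessor per candidate
        score = total[back[t], np.arange(K)] + emit[t]

    # Backtrack from the best final candidate.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return cand_cents[path]
```

In this sketch the note sequence acts only as a hard window on the allowed pitch range, so a strong accompaniment peak outside the window cannot capture the track, while fluctuations such as vibrato inside the window are preserved. If no candidate falls inside the window for some frame, the returned value for that frame is arbitrary.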
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication status: Published - 2014
Conference: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence
Duration: 2014 May 4 → 2014 May 9
Keywords
- F0 estimation
- Singing voice analysis
- Vocal expression identification / transcription
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering