TY - GEN
T1 - Incremental polyphonic audio to score alignment using beat tracking for singer robots
AU - Otsuka, Takuma
AU - Murata, Kazumasa
AU - Nakadai, Kazuhiro
AU - Takahashi, Toru
AU - Komatani, Kazunori
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2009/12/11
Y1 - 2009/12/11
N2 - We aim to develop a singer robot capable of listening to music with its own "ears" and interacting with a human's musical performance. Such a robot requires at least three functions: listening to the music, understanding which position in the music is being performed, and generating a singing voice. In this paper, we focus on the second function, that is, the capability to align an audio signal with its symbolically represented musical score. The issues underlying the score alignment problem are (1) diversity in the sounds of various musical instruments, (2) differences between the audio signal and the musical score, and (3) fluctuation in the tempo of the musical performance. Our solutions to these issues are (1) the design of features based on a chroma vector in the 12-tone model and on sound onsets, (2) the definition of rareness for each tone, based on the idea that a scarcely used tone is salient in the audio signal, and (3) the use of a switching Kalman filter for robust tempo estimation. Experimental results on 100 popular music tunes show that our score alignment method improves the average cumulative absolute error in score alignment by 29% compared with beat tracking without score alignment.
AB - We aim to develop a singer robot capable of listening to music with its own "ears" and interacting with a human's musical performance. Such a robot requires at least three functions: listening to the music, understanding which position in the music is being performed, and generating a singing voice. In this paper, we focus on the second function, that is, the capability to align an audio signal with its symbolically represented musical score. The issues underlying the score alignment problem are (1) diversity in the sounds of various musical instruments, (2) differences between the audio signal and the musical score, and (3) fluctuation in the tempo of the musical performance. Our solutions to these issues are (1) the design of features based on a chroma vector in the 12-tone model and on sound onsets, (2) the definition of rareness for each tone, based on the idea that a scarcely used tone is salient in the audio signal, and (3) the use of a switching Kalman filter for robust tempo estimation. Experimental results on 100 popular music tunes show that our score alignment method improves the average cumulative absolute error in score alignment by 29% compared with beat tracking without score alignment.
UR - http://www.scopus.com/inward/record.url?scp=76249090933&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=76249090933&partnerID=8YFLogxK
U2 - 10.1109/IROS.2009.5354637
DO - 10.1109/IROS.2009.5354637
M3 - Conference contribution
AN - SCOPUS:76249090933
SN - 9781424438044
T3 - 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
SP - 2289
EP - 2296
BT - 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
T2 - 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2009
Y2 - 11 October 2009 through 15 October 2009
ER -