TY - GEN
T1 - Singer identification based on accompaniment sound reduction and reliable frame selection
AU - Fujihara, Hiromasa
AU - Kitahara, Tetsuro
AU - Goto, Masataka
AU - Komatani, Kazunori
AU - Ogata, Tetsuya
AU - Okuno, Hiroshi G.
PY - 2005/12/1
Y1 - 2005/12/1
N2 - This paper describes a method for automatic singer identification from polyphonic musical audio signals including sounds of various instruments. Because singing voices play an important role in musical pieces with a vocal part, the identification of singer names is useful for music information retrieval systems. The main problem in automatically identifying singers is the negative influence of accompaniment sounds. To solve this problem, we developed two methods: accompaniment sound reduction and reliable frame selection. The former method makes it possible to identify the singer of a singing voice after reducing accompaniment sounds. It first extracts harmonic components of the predominant melody from sound mixtures and then resynthesizes the melody by using a sinusoidal model driven by those components. The latter method then judges whether each frame of the obtained melody is reliable (i.e., little influenced by accompaniment sounds) by using two Gaussian mixture models for vocal and non-vocal frames. This enables singer identification using only reliable vocal portions of musical pieces. Experimental results with forty popular-music songs by ten singers showed that our method was able to reduce the influence of accompaniment sounds and achieved an accuracy of 95%, while a conventional method achieved an accuracy of 53%.
KW - Artist identification
KW - Melody extraction
KW - Similarity-based MIR
KW - Singer identification
KW - Singing detection
UR - http://www.scopus.com/inward/record.url?scp=84873533890&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84873533890&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84873533890
SN - 9780955117909
T3 - ISMIR 2005 - 6th International Conference on Music Information Retrieval
SP - 329
EP - 336
BT - ISMIR 2005 - 6th International Conference on Music Information Retrieval
T2 - 6th International Conference on Music Information Retrieval, ISMIR 2005
Y2 - 11 September 2005 through 15 September 2005
ER -