TY - GEN
T1 - An automatic singing impression estimation method using factor analysis and multiple regression
AU - Kanato, Ai
AU - Nakano, Tomoyasu
AU - Goto, Masataka
AU - Kikuchi, Hideaki
PY - 2014/1/1
Y1 - 2014/1/1
N2 - This paper describes a method for estimating the impression of a singing voice via acoustic features. While much research has been conducted on singing impression, to date no method for determining appropriate words to represent the impressions created by a person's singing has been developed, primarily due to the lack of a comprehensive evaluation scale. We followed two steps: construction of such an impression scale, and development of models for estimating the impression score of each word. In the scale construction, two experiments were carried out. Firstly, 44 words were selected as relevant words based on subjective evaluation. Secondly, 12 words were selected as an impression scale, and three factors ("powerful", "cautious", and "cheerful") were extracted by factor analysis. To estimate impression scores, multiple regression models were constructed for each impression word with acoustic features. The models were tested by cross-validation. The average R² value for the 12 words of the complete scale was 0.567, and the R² values for the three factors were 0.863 (powerful), 0.381 (cautious), and 0.603 (cheerful).
UR - http://www.scopus.com/inward/record.url?scp=84908873176&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908873176&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84908873176
T3 - Proceedings - 40th International Computer Music Conference, ICMC 2014 and 11th Sound and Music Computing Conference, SMC 2014 - Music Technology Meets Philosophy: From Digital Echos to Virtual Ethos
SP - 1244
EP - 1251
BT - Proceedings - 40th International Computer Music Conference, ICMC 2014 and 11th Sound and Music Computing Conference, SMC 2014 - Music Technology Meets Philosophy
A2 - Kouroupetroglou, Georgios
A2 - Georgaki, Anastasia
PB - National and Kapodistrian University of Athens
T2 - 40th International Computer Music Conference, ICMC 2014, Joint with the 11th Sound and Music Computing Conference, SMC 2014 - Music Technology Meets Philosophy: From Digital Echos to Virtual Ethos
Y2 - 14 September 2014 through 20 September 2014
ER -