TY - JOUR
T1 - Joint equal contribution of global and local features for image annotation
AU - Sarin, Supheakmungkol
AU - Kameyama, Wataru
PY - 2009/1/1
Y1 - 2009/1/1
N2 - Image annotation is a very important task as the number of photographs has grown enormously. This paper describes our participation in the ImageCLEF Large Scale Visual Concept Detection and Annotation Task 2009. We present the method used for our best run. Our approach is inspired by a recently proposed method in which the joint equal contribution (JEC) of simple global color and texture features can outperform state-of-the-art annotation techniques [10]. Our idea is that if such simple features can do so well, then a combination of higher-level features should do even better. Studies have shown that the concurrent use of saliency and the gist of the scene is a major trait of the human visual system. Therefore, in this preliminary study, we propose to explore the combination of different visual features at the global, local, and scene levels, including global and local color, texture, and the gist of the scene. The experiments confirm that higher-level features lead to better performance. Through the experiments, we also found that using 40 nearest neighbors with HSV, HSV (at saliency regions), HAAR, GIST (full scene), and GIST (scene at the center) as features produces the best result. We finally identify weaknesses in our approach and ways in which the system could be optimized and improved.
AB - Image annotation is a very important task as the number of photographs has grown enormously. This paper describes our participation in the ImageCLEF Large Scale Visual Concept Detection and Annotation Task 2009. We present the method used for our best run. Our approach is inspired by a recently proposed method in which the joint equal contribution (JEC) of simple global color and texture features can outperform state-of-the-art annotation techniques [10]. Our idea is that if such simple features can do so well, then a combination of higher-level features should do even better. Studies have shown that the concurrent use of saliency and the gist of the scene is a major trait of the human visual system. Therefore, in this preliminary study, we propose to explore the combination of different visual features at the global, local, and scene levels, including global and local color, texture, and the gist of the scene. The experiments confirm that higher-level features lead to better performance. Through the experiments, we also found that using 40 nearest neighbors with HSV, HSV (at saliency regions), HAAR, GIST (full scene), and GIST (scene at the center) as features produces the best result. We finally identify weaknesses in our approach and ways in which the system could be optimized and improved.
KW - Automatic image annotation
KW - Color
KW - Gist of scene
KW - Joint equal contribution
KW - K nearest neighbors
KW - Saliency
KW - Texture
UR - http://www.scopus.com/inward/record.url?scp=84922051553&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84922051553&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84922051553
SN - 1613-0073
VL - 1175
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2009 Cross Language Evaluation Forum Workshop, CLEF 2009, co-located with the 13th European Conference on Digital Libraries, ECDL 2009
Y2 - 30 September 2009 through 2 October 2009
ER -