TY - GEN
T1 - Automatic evaluation of iconic image retrieval based on colour, shape, and texture
AU - Togashi, Riku
AU - Fujita, Sumio
AU - Sakai, Tetsuya
N1 - Publisher Copyright: © 2020 ACM.
PY - 2020/6/8
Y1 - 2020/6/8
N2 - Product image search is required to deal with large target image datasets which are frequently updated, and therefore it is not always practical to maintain exhaustive and up-to-date relevance assessments for tuning and evaluating the search engine. Moreover, in similar product image search where the query is also an image, it is difficult to identify the possible search intents behind it and thereby verbalise the relevance criteria for the assessors, especially if graded relevance assessments are required. In this study, we focus on similar product image search within a given product category (e.g., shoes), wherein each image is iconic (i.e., the image clearly shows what the product looks like and basically nothing else), and propose an initial approach to evaluating the task without relying on manual relevance assessments. More specifically, we build a simple probabilistic model that assumes that an image is generated from latent intents representing shape, texture, and colour, which enables us to estimate the relevance score of each image and thereby compute graded relevance measures for any image search engine result page. Through large-scale crowdsourcing experiments, we demonstrate that our proposed measures, InDCG (which is based on per-intent binary relevance) and D-InDCG (which is based on per-intent graded relevance), align reasonably well with human SERP preferences and with human image preferences. Hence, our automatic measures may be useful at least for rough tuning and evaluation of similar product image search.
KW - Evaluation
KW - Graded relevance
KW - Iconic images
KW - Image search
KW - Product search
KW - Retrieval effectiveness
UR - http://www.scopus.com/inward/record.url?scp=85086889504&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086889504&partnerID=8YFLogxK
U2 - 10.1145/3372278.3390741
DO - 10.1145/3372278.3390741
M3 - Conference contribution
AN - SCOPUS:85086889504
T3 - ICMR 2020 - Proceedings of the 2020 International Conference on Multimedia Retrieval
SP - 346
EP - 354
BT - ICMR 2020 - Proceedings of the 2020 International Conference on Multimedia Retrieval
PB - Association for Computing Machinery, Inc
T2 - 10th ACM International Conference on Multimedia Retrieval, ICMR 2020
Y2 - 8 June 2020 through 11 June 2020
ER -