TY - GEN
T1 - Retrieved Image Refinement by Bootstrap Outlier Test
AU - Watanabe, Hayato
AU - Hino, Hideitsu
AU - Akaho, Shotaro
AU - Murata, Noboru
N1 - Funding Information:
Partly supported by JST CREST JPMJCR1761 and JSPS KAKENHI 17H01748, 17H02953, and 19H04113.
Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
AB - Outlier detection is used to identify data points or a small number of subsets of data that are significantly different from most other data in a given dataset. It is challenging to detect outliers using an objective and quantitative approach. Methods that use the framework of statistical hypothesis testing are widely used by assuming a specific parametric distribution as a data generation model, but there is no guarantee that the distribution of data can be adequately approximated by a parametric distribution in practical problems. In this paper, a simple method is proposed to objectively detect outliers by hypothesis testing without assuming a specific distribution of outlier scores. By using an arbitrary outlier score function, hypothesis testing is used to determine whether each given sample is an outlier. The distribution of the test statistics is needed for the hypothesis test, and is estimated based on the given data using the bootstrap method. The effectiveness of the proposed outlier test was verified by applying it to outlier detection for text-based image retrieval, where it improved the quality of image searches by removing irrelevant images.
KW - Hypothesis testing
KW - Image retrieval
KW - Outlier removal
UR - http://www.scopus.com/inward/record.url?scp=85072871968&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072871968&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-29888-3_41
DO - 10.1007/978-3-030-29888-3_41
M3 - Conference contribution
AN - SCOPUS:85072871968
SN - 9783030298876
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 505
EP - 517
BT - Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings
A2 - Vento, Mario
A2 - Percannella, Gennaro
PB - Springer Verlag
T2 - 18th International Conference on Computer Analysis of Images and Patterns, CAIP 2019
Y2 - 3 September 2019 through 5 September 2019
ER -