TY - GEN
T1 - Fast Support Vector Data Description training using edge detection on large datasets
AU - Hu, Chenlong
AU - Zhou, Bo
AU - Hu, Jinglu
PY - 2014/9/3
Y1 - 2014/9/3
N2 - Support Vector Data Description (SVDD) inherits properties of Support Vector Machines (SVM) and has become a prominent One Class Classifier (OCC). As with standard SVM, its O(n^3) time and O(n^2) space complexities, where n is the number of training samples, become major limitations on large training datasets. A simple and effective way to overcome these limitations is to reduce the size of the training dataset by retaining only the samples most relevant to the learned classifier. The enclosing decision boundary of a trained SVDD always lies in the edge area of the data distribution and is determined by a small subset of Support Vectors (SVs). Therefore, in this paper, we present a method based on edge detection that preserves the edge samples most relevant to the decision boundary. Clustering techniques are also applied to keep centroids that represent the global properties of the distribution, so that the decision boundary does not extend too far outward. To limit the influence of noise, each training pattern is assigned a weight. Experiments on real and artificial datasets show that a classifier trained on the reconstructed training set, consisting of edge points and centroids, preserves performance while training much faster.
AB - Support Vector Data Description (SVDD) inherits properties of Support Vector Machines (SVM) and has become a prominent One Class Classifier (OCC). As with standard SVM, its O(n^3) time and O(n^2) space complexities, where n is the number of training samples, become major limitations on large training datasets. A simple and effective way to overcome these limitations is to reduce the size of the training dataset by retaining only the samples most relevant to the learned classifier. The enclosing decision boundary of a trained SVDD always lies in the edge area of the data distribution and is determined by a small subset of Support Vectors (SVs). Therefore, in this paper, we present a method based on edge detection that preserves the edge samples most relevant to the decision boundary. Clustering techniques are also applied to keep centroids that represent the global properties of the distribution, so that the decision boundary does not extend too far outward. To limit the influence of noise, each training pattern is assigned a weight. Experiments on real and artificial datasets show that a classifier trained on the reconstructed training set, consisting of edge points and centroids, preserves performance while training much faster.
UR - http://www.scopus.com/inward/record.url?scp=84908471854&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908471854&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2014.6889718
DO - 10.1109/IJCNN.2014.6889718
M3 - Conference contribution
AN - SCOPUS:84908471854
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 2176
EP - 2182
BT - Proceedings of the International Joint Conference on Neural Networks
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 International Joint Conference on Neural Networks, IJCNN 2014
Y2 - 6 July 2014 through 11 July 2014
ER -