TY - GEN
T1 - A fast SVM training method for very large datasets
AU - Li, Boyang
AU - Wang, Qiangwei
AU - Hu, Jinglu
PY - 2009/11/20
Y1 - 2009/11/20
N2 - In a standard support vector machine (SVM), the training process has O(n^3) time and O(n^2) space complexities, where n is the size of training dataset. Thus, it is computationally infeasible for very large datasets. Reducing the size of training dataset is naturally considered to solve this problem. SVM classifiers depend on only support vectors (SVs) that lie close to the separation boundary. Therefore, we need to reserve the samples that are likely to be SVs. In this paper, we propose a method based on the edge detection technique to detect these samples. To preserve the entire distribution properties, we also use a clustering algorithm such as K-means to calculate the centroids of clusters. The samples selected by edge detector and the centroids of clusters are used to reconstruct the training dataset. The reconstructed training dataset with a smaller size makes the training process much faster, but without degrading the classification accuracies.
AB - In a standard support vector machine (SVM), the training process has O(n^3) time and O(n^2) space complexities, where n is the size of training dataset. Thus, it is computationally infeasible for very large datasets. Reducing the size of training dataset is naturally considered to solve this problem. SVM classifiers depend on only support vectors (SVs) that lie close to the separation boundary. Therefore, we need to reserve the samples that are likely to be SVs. In this paper, we propose a method based on the edge detection technique to detect these samples. To preserve the entire distribution properties, we also use a clustering algorithm such as K-means to calculate the centroids of clusters. The samples selected by edge detector and the centroids of clusters are used to reconstruct the training dataset. The reconstructed training dataset with a smaller size makes the training process much faster, but without degrading the classification accuracies.
UR - http://www.scopus.com/inward/record.url?scp=70449585520&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449585520&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2009.5178618
DO - 10.1109/IJCNN.2009.5178618
M3 - Conference contribution
AN - SCOPUS:70449585520
SN - 9781424435531
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 1784
EP - 1789
BT - 2009 International Joint Conference on Neural Networks, IJCNN 2009
T2 - 2009 International Joint Conference on Neural Networks, IJCNN 2009
Y2 - 14 June 2009 through 19 June 2009
ER -