Abstract
Although knowledge discovery and data mining techniques have been applied successfully to many real-world problems, classifying imbalanced data remains challenging. An imbalanced class distribution biases classification results toward the majority class; in other words, the accuracy for the minority class is often quite low. Traditional classification methods such as artificial neural networks (ANN) and k-nearest neighbors (KNN) cannot solve this problem effectively. How to improve classification accuracy on imbalanced data has attracted growing attention from both academia and industry. The objective of this paper is to build a fused method, consisting of data scaling, a re-sampling technique, and an SVM classifier with an RBF kernel (SVM-RBF), to classify a large imbalanced data set obtained from the semiconductor industry. The resulting classifier predicts an output that will be used for production health control. The experimental results show that the SVM-RBF model substantially improves classification accuracy for the minority class. ICIC International
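The building blocks named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the paper's implementation: the abstract does not say which scaling or re-sampling variant was used, so min-max scaling and random oversampling of the minority class are assumptions made here for illustration, and all function names are hypothetical. The sketch also shows the RBF kernel value, k(x, z) = exp(-gamma * ||x - z||^2), and a per-class accuracy measure, which exposes the majority-class bias that overall accuracy hides. Training the SVM itself is left to an SVM library and is not shown.

```python
import math
import random


def min_max_scale(rows):
    """Scale every feature column to the range [0, 1] (assumed scaling variant)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes are equal in size
    (assumed re-sampling variant)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples, computed separately for each true class."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (t == p)
    return {c: hits[c] / totals[c] for c in totals}
```

For example, with nine majority samples and one minority sample, a classifier that always predicts the majority class is 90% accurate overall, yet `per_class_accuracy([0] * 9 + [1], [0] * 10)` reports an accuracy of 0.0 for the minority class. This is the effect the fused method is meant to counter.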
Original language | English |
---|---|
Pages (from-to) | 2419-2424 |
Number of pages | 6 |
Journal | ICIC Express Letters |
Volume | 4 |
Issue number | 6 B |
Publication status | Published - 2010 Dec |
Keywords
- Imbalanced data
- RBF kernel function
- SVM classifier
ASJC Scopus subject areas
- Computer Science (all)
- Control and Systems Engineering