TY - GEN
T1 - Quasi-Linear SVM with Local Offsets for High-dimensional Imbalanced Data Classification
AU - Yanze, Li
AU - Ogai, Harutoshi
N1 - Publisher Copyright: © 2020 The Society of Instrument and Control Engineers - SICE.
PY - 2020/9/23
Y1 - 2020/9/23
N2 - Class imbalance often arises in classification problems. A special case is within-class imbalance, which worsens the imbalanced distribution problem and increases the complexity of the concept to be learned. Most methods for imbalanced data classification focus on finding a global boundary to address the between-class imbalance problem. This paper proposes an effective quasi-linear network with local offset adjustment for imbalanced classification problems. First, we propose a gated piecewise linear network, in which an autoencoder-based partitioning method is modified for imbalanced datasets to divide the input space into multiple linearly separable partitions along the potential separation boundary. Next, a quasi-linear SVM is constructed based on the gating signals obtained from the autoencoder partitioning information. Then, a neural network is trained with the F-score as its loss function to generate the local offsets for each local cluster. Finally, a quasi-linear SVM classifier with local offsets is constructed for the imbalanced datasets. The proposed method avoids calculating Euclidean distances, so it can be applied to high-dimensional datasets. Simulation results on several real-world datasets show that the method is effective for imbalanced data classification, especially on high-dimensional data.
AB - Class imbalance often arises in classification problems. A special case is within-class imbalance, which worsens the imbalanced distribution problem and increases the complexity of the concept to be learned. Most methods for imbalanced data classification focus on finding a global boundary to address the between-class imbalance problem. This paper proposes an effective quasi-linear network with local offset adjustment for imbalanced classification problems. First, we propose a gated piecewise linear network, in which an autoencoder-based partitioning method is modified for imbalanced datasets to divide the input space into multiple linearly separable partitions along the potential separation boundary. Next, a quasi-linear SVM is constructed based on the gating signals obtained from the autoencoder partitioning information. Then, a neural network is trained with the F-score as its loss function to generate the local offsets for each local cluster. Finally, a quasi-linear SVM classifier with local offsets is constructed for the imbalanced datasets. The proposed method avoids calculating Euclidean distances, so it can be applied to high-dimensional datasets. Simulation results on several real-world datasets show that the method is effective for imbalanced data classification, especially on high-dimensional data.
KW - F-measure
KW - imbalanced data classification
KW - kernel composition
KW - support vector machine
KW - within-class imbalances
UR - http://www.scopus.com/inward/record.url?scp=85096359749&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096359749&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85096359749
T3 - 2020 59th Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2020
SP - 882
EP - 887
BT - 2020 59th Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 59th Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2020
Y2 - 23 September 2020 through 26 September 2020
ER -