TY - GEN
T1 - SIRSYN
T2 - 15th IEEE International Conference on Knowledge Graph, ICKG 2024
AU - Sutou, Akiyoshi
AU - Wang, Jinfang
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Imbalanced data, where class labels in a training dataset are significantly skewed, often reduces the prediction accuracy for minority classes when traditional algorithms are used. To address this, various oversampling techniques such as SMOTE have been proposed, but they often generate minority data in majority-class regions. We propose an improved method, SIRSYN, which applies the Sampling Importance Resampling (SIR) method to existing synthetic data generation techniques. This reduces the risk of generating data in inappropriate locations. Using 60 imbalanced datasets from the KEEL repository, we compared SIRSYN with 13 existing methods. SIRSYN achieved superior G-mean and F1 scores, indicating its effectiveness in enhancing oversampling techniques for imbalanced classification tasks.
AB - Imbalanced data, where class labels in a training dataset are significantly skewed, often reduces the prediction accuracy for minority classes when traditional algorithms are used. To address this, various oversampling techniques such as SMOTE have been proposed, but they often generate minority data in majority-class regions. We propose an improved method, SIRSYN, which applies the Sampling Importance Resampling (SIR) method to existing synthetic data generation techniques. This reduces the risk of generating data in inappropriate locations. Using 60 imbalanced datasets from the KEEL repository, we compared SIRSYN with 13 existing methods. SIRSYN achieved superior G-mean and F1 scores, indicating its effectiveness in enhancing oversampling techniques for imbalanced classification tasks.
KW - Imbalanced data
KW - Oversampling
KW - SIR
UR - http://www.scopus.com/inward/record.url?scp=86000199786&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=86000199786&partnerID=8YFLogxK
U2 - 10.1109/ICKG63256.2024.00050
DO - 10.1109/ICKG63256.2024.00050
M3 - Conference contribution
AN - SCOPUS:86000199786
T3 - Proceedings - 2024 IEEE International Conference on Knowledge Graph, ICKG 2024
SP - 342
EP - 351
BT - Proceedings - 2024 IEEE International Conference on Knowledge Graph, ICKG 2024
A2 - Chen, Huajun
A2 - Fensel, Anna
A2 - Zhu, Xingquan
A2 - Wattenhofer, Roger
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 December 2024 through 12 December 2024
ER -