TY - JOUR
T1 - An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection
AU - Ye, Chaochao
AU - Pan, Julong
AU - Jin, Qun
N1 - Publisher Copyright: © 2018 Elsevier B.V.
PY - 2019/3
Y1 - 2019/3
N2 - With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis have matured and are becoming more common. In tumor risk analysis, identifying the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but less correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and the performance of the proposed method is compared with that of nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with statistically significant differences and efficiently reduces the number of gene expression levels.
AB - With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis have matured and are becoming more common. In tumor risk analysis, identifying the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but less correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and the performance of the proposed method is compared with that of nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with statistically significant differences and efficiently reduces the number of gene expression levels.
KW - Gene selection
KW - Information gain
KW - Pearson correlation coefficient
KW - Simplified swarm optimization
KW - Tumor
UR - http://www.scopus.com/inward/record.url?scp=85055671515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055671515&partnerID=8YFLogxK
U2 - 10.1016/j.future.2018.10.008
DO - 10.1016/j.future.2018.10.008
M3 - Article
AN - SCOPUS:85055671515
SN - 0167-739X
VL - 92
SP - 407
EP - 418
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -