TY - GEN
T1 - Detecting Inconsistent Vulnerable Software Version in Security Vulnerability Reports
AU - Ren, Hansong
AU - Li, Xuejun
AU - Lei, Liao
AU - Ou, Guoliang
AU - Sun, Hongyu
AU - Wu, Gaofei
AU - Tian, Xiao
AU - Hu, Jinglu
AU - Zhang, Yuqing
N1 - Funding Information:
This work was supported by the National Key Research and Development Program of China (2018YFB0804701), the Key Research and Development Program of Hainan Province (ZDYF202012), and the Guangxi Key Laboratory of Cryptography and Information Security (No. GCIS202123).
Publisher Copyright:
© 2022, Springer Nature Singapore Pte Ltd.
PY - 2022
Y1 - 2022
N2 - At present, vulnerability database research has mainly focused on whether the disclosed information is accurate. However, the information differences between the various vulnerability databases have received little attention. This article proposes WITTY (softWare versIon inconsisTency measuremenT sYstem) to detect differences between the affected software versions reported in NVD and in vulnerability databases in different languages (including the English CVE and OpenWall and the Chinese CNNVD and CNVD, among eight databases in total). WITTY enables large-scale quantitative measurement of information consistency. We introduce custom-designed named entity recognition (NER) and relation extraction (RE) models based on deep learning, enabling WITTY to recognize previously unseen software names and versions based on sentence structure and context. Evaluation against ground truth shows that the system performs well (95.3% accuracy rate, 89.9% recall rate). We use data from 8 vulnerability databases covering the past 21 years, involving 554,725 vulnerability reports. The results show that inconsistent software versions are prevalent. The average exact match rate of the English vulnerability databases CVE, OpenWall, and other vulnerability databases with CVE is only 22.1%. The average exact match rate of the Chinese CNNVD and CNVD is 49.5%, and the exact match rate of the Russian vulnerability databases is 25.8%.
AB - At present, vulnerability database research has mainly focused on whether the disclosed information is accurate. However, the information differences between the various vulnerability databases have received little attention. This article proposes WITTY (softWare versIon inconsisTency measuremenT sYstem) to detect differences between the affected software versions reported in NVD and in vulnerability databases in different languages (including the English CVE and OpenWall and the Chinese CNNVD and CNVD, among eight databases in total). WITTY enables large-scale quantitative measurement of information consistency. We introduce custom-designed named entity recognition (NER) and relation extraction (RE) models based on deep learning, enabling WITTY to recognize previously unseen software names and versions based on sentence structure and context. Evaluation against ground truth shows that the system performs well (95.3% accuracy rate, 89.9% recall rate). We use data from 8 vulnerability databases covering the past 21 years, involving 554,725 vulnerability reports. The results show that inconsistent software versions are prevalent. The average exact match rate of the English vulnerability databases CVE, OpenWall, and other vulnerability databases with CVE is only 22.1%. The average exact match rate of the Chinese CNNVD and CNVD is 49.5%, and the exact match rate of the Russian vulnerability databases is 25.8%.
KW - Deep learning
KW - Natural language processing
KW - Security breach
KW - Security vulnerability databases
UR - http://www.scopus.com/inward/record.url?scp=85126207009&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126207009&partnerID=8YFLogxK
U2 - 10.1007/978-981-19-0523-0_6
DO - 10.1007/978-981-19-0523-0_6
M3 - Conference contribution
AN - SCOPUS:85126207009
SN - 9789811905223
T3 - Communications in Computer and Information Science
SP - 78
EP - 99
BT - Frontiers in Cyber Security - 4th International Conference, FCS 2021, Revised Selected Papers
A2 - Cao, Chunjie
A2 - Zhang, Yuqing
A2 - Hong, Yuan
A2 - Wang, Ding
PB - Springer Science and Business Media Deutschland GmbH
T2 - 4th International Conference on Frontiers in Cyber Security, FCS 2021
Y2 - 17 December 2021 through 19 December 2021
ER -