Estimating vulnerability metrics with word embedding and multiclass classification methods

Kekul, Hakan; Ergen, Burhan; Arslan, Halil

dc.authorid	ARSLAN, Halil/0000-0003-3286-5159
dc.contributor.author	Kekul, Hakan
dc.contributor.author	Ergen, Burhan
dc.contributor.author	Arslan, Halil
dc.date.accessioned	2024-10-26T18:07:17Z
dc.date.available	2024-10-26T18:07:17Z
dc.date.issued	2024
dc.department	Sivas Cumhuriyet Üniversitesi
dc.description.abstract	Cyber security has grown in importance ever since information technologies became an integral part of modern life. Software security is one of the fundamental areas of cyber security, and software vulnerabilities are one of the main reasons information systems are exploited. For this reason, vulnerabilities have long been systematically reported, analyzed and classified under a protocol established between states and the stakeholders of the issue. Today all of these processes are carried out manually, which leads to human error and delays. The current study therefore aims to assist experts by speeding up these processes and improving the accuracy of the analysis results. To this end, we propose a model that uses the technical descriptions in security reports, which are written in natural language. The model combines word embedding approaches from natural language processing with multiclass classification algorithms. To allow an accurate comparison, we chose the NVD database, which is publicly available and accepted as a reference. We also compared the proposed model with previous studies in the literature; so that the results could be analyzed more accurately, our model was trained on the data sets of the studies it was compared with, and the results are reported. The proposed method achieved estimation success in the range of 87.34-96.25% for CVSS 2.0 metrics and 84-90% for CVSS 3.1 metrics. This study, which uses different word embedding and classification algorithms together, is one of the few studies on the latest version of the official scoring system used to classify software security vulnerabilities. Moreover, given the size of the data set it uses and the number of databases evaluated, it is the most comprehensive and original study in its field.
dc.description.sponsorship	Scientific and Technological Research Council of Turkey (TUBITAK) [121E298]
dc.description.sponsorship	This study is supported by The Scientific and Technological Research Council of Turkey (TUBITAK) with project number 121E298.
dc.identifier.doi	10.1007/s10207-023-00734-7
dc.identifier.endpage	270
dc.identifier.issn	1615-5262
dc.identifier.issn	1615-5270
dc.identifier.issue	1
dc.identifier.scopus	2-s2.0-85167800536
dc.identifier.scopusquality	Q1
dc.identifier.startpage	247
dc.identifier.uri	https://doi.org/10.1007/s10207-023-00734-7
dc.identifier.uri	https://hdl.handle.net/20.500.12418/29427
dc.identifier.volume	23
dc.identifier.wos	WOS:001046583400001
dc.identifier.wosquality	Q2
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.language.iso	en
dc.publisher	Springer
dc.relation.ispartof	International Journal of Information Security
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	Software security
dc.subject	Software vulnerability
dc.subject	Information security
dc.subject	Text analysis
dc.subject	Multiclass classification
dc.title	Estimating vulnerability metrics with word embedding and multiclass classification methods
dc.type	Article
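
The abstract in this record describes estimating CVSS metrics from natural-language vulnerability descriptions by combining word embeddings with multiclass classification. The sketch below is only a toy illustration of that general idea and is not the authors' model: the three-dimensional word vectors, the training sentences, the label names and the nearest-centroid classifier are all assumptions made for the example. It treats one metric, the CVSS 2.0 Access Vector (Local, Adjacent Network, Network), as a three-class problem.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical 3-dimensional word vectors. A real system would load trained
# embeddings; the record does not say which ones the authors used.
TOY_EMBEDDINGS = {
    "remote": (1.0, 0.0, 0.0),
    "network": (0.9, 0.1, 0.0),
    "local": (0.0, 1.0, 0.0),
    "user": (0.1, 0.9, 0.0),
    "adjacent": (0.0, 0.0, 1.0),
    "bluetooth": (0.1, 0.0, 0.9),
}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def train_centroids(samples):
    """Compute one mean document vector (centroid) per class label."""
    grouped = defaultdict(list)
    for text, label in samples:
        grouped[label].append(embed(text))
    return {
        label: tuple(sum(component) / len(vecs) for component in zip(*vecs))
        for label, vecs in grouped.items()
    }


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict(text, centroids):
    """Return the label whose centroid is most similar to the text vector."""
    vector = embed(text)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Illustrative training descriptions (invented, not taken from NVD).
TRAIN = [
    ("remote attacker sends crafted network packets", "NETWORK"),
    ("local user gains privileges", "LOCAL"),
    ("attacker within bluetooth range on adjacent segment", "ADJACENT_NETWORK"),
]
CENTROIDS = train_centroids(TRAIN)

print(predict("a remote attacker can crash the service", CENTROIDS))
```

Nearest-centroid classification was chosen here only because it fits in a few lines of standard-library Python; any multiclass classifier could take the place of `predict`, and each further CVSS metric would need its own label set and classifier.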

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
