A multiclass hybrid approach to estimating software vulnerability vectors and severity score
Citation
Kekül, H., Ergen, B., & Arslan, H. (2021). A multiclass hybrid approach to estimating software vulnerability vectors and severity score. Journal of Information Security and Applications, 63, 103028.
Abstract
Classifying detected software vulnerabilities is an important process. However, the metric values of security vectors are manually determined by humans, which takes time and may introduce errors stemming from human nature. These metrics are important because of their role in the calculation of vulnerability severity. It is necessary to use machine learning algorithms and data mining techniques to improve the quality and speed of vulnerability analysis and discovery processes. However, studies in this area are still limited. In this study, vulnerability vectors were estimated using the natural language processing techniques bag of words, term frequency–inverse document frequency, and n-gram for feature extraction together with various multiclass classification algorithms, namely Naïve Bayes, decision tree, k-nearest neighbors, multilayer perceptron, and random forest. Our experiments using a large public dataset facilitate assessment and provide a standard-compliant prediction model for classifying software vulnerability vectors. The results show that the joint use of different techniques and classification algorithms is a promising solution to a multi-probability and difficult-to-predict problem. In addition, our study fills an important gap in its field in terms of the size of the dataset used and because it covers a vulnerability scoring system version that has not yet been extensively studied.
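The feature extraction and classification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four vulnerability descriptions, the NETWORK/LOCAL labels, the function and class names, and the particular TF-IDF variant (idf = ln(N / df)) are all assumptions made for illustration, and only a hand-written Naïve Bayes classifier stands in for the several algorithms the study compares.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return lowercase word n-grams of length 1..n_max."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]


def bag_of_words(text, n_max=2):
    """Bag of words over n-grams: term -> raw count."""
    return Counter(ngrams(text, n_max))


def tfidf(docs, n_max=2):
    """Per-document TF-IDF weights, using idf = ln(N / df) (one common variant)."""
    bags = [bag_of_words(d, n_max) for d in docs]
    df = Counter(term for bag in bags for term in bag)  # document frequency
    n_docs = len(docs)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in bag.items()})
    return weights


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over n-gram counts."""

    def fit(self, docs, labels, n_max=2):
        self.n_max = n_max
        self.class_docs = Counter(labels)
        self.term_counts = {c: Counter() for c in self.class_docs}
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(ngrams(doc, n_max))
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())

        def score(c):
            counts = self.term_counts[c]
            denom = sum(counts.values()) + len(self.vocab)
            s = math.log(self.class_docs[c] / total_docs)  # log prior
            for t in ngrams(text, self.n_max):
                s += math.log((counts[t] + 1) / denom)  # smoothed log likelihood
            return s

        return max(self.class_docs, key=score)


# Hypothetical toy descriptions and labels for one vector metric.
docs = [
    "remote attacker sends crafted packet over the network",
    "remote attacker exploits network service via crafted request",
    "local user gains privileges via crafted file",
    "local user reads kernel memory via crafted file",
]
labels = ["NETWORK", "NETWORK", "LOCAL", "LOCAL"]

weights = tfidf(docs)
model = NaiveBayes().fit(docs, labels)
print(model.predict("remote attacker sends crafted request"))
```

In this toy corpus the word "crafted" occurs in every description, so its TF-IDF weight is zero, while terms such as "remote" keep a positive weight. A full vulnerability vector would need one such classifier per metric, each with more than two classes.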
Source
Journal of Information Security and Applications
Volume
63
Issue
103028
URI
https://www.sciencedirect.com/science/article/abs/pii/S2214212621001939
https://hdl.handle.net/20.500.12418/12884