Insider Threat Ransomware Detection through ML on PE Files

Osama Alhodairy, Tarek Abbes

Abstract

Ransomware has become a persistent and complex cyber threat that exposes companies in all industries to significant risk. Although traditional ransomware mitigation has focused on external attackers, the growing number of insider-driven attacks poses a new challenge for cybersecurity practitioners. This paper addresses the identification of insider ransomware threats by applying machine learning to Windows Portable Executable (PE) file metadata. Our approach draws on a dataset of 138,581 PE files containing both ransomware and benign samples. Feature engineering extracts discriminative properties from PE headers, sections, imports, and binary patterns. We applied ten ML models to the dataset and analyzed their results: nine supervised learning algorithms (Random Forest (RF), XGBoost, Decision Tree, Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM), K-Nearest Neighbors (KNN), AdaBoost, Logistic Regression, and Support Vector Machine (SVM)) and a hybrid model that combines Random Forest and XGBoost for classification. The hybrid model performed best, achieving 99.49% precision and recall in identifying ransomware samples. Among the individual algorithms, RF achieved the highest accuracy at 99.46%, followed by XGBoost (99.42%), Decision Tree (99.10%), GBM (98.91%), AdaBoost (98.91%), SVM (98.91%), LightGBM (98.89%), KNN (98.77%), and Logistic Regression (71.67%). Compared with existing solutions, the proposed approach showed higher accuracy and better generalization.
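To illustrate the kind of header metadata such a pipeline can draw on, the following minimal sketch reads a few COFF file header fields from raw PE bytes using only the Python standard library. It is not the authors' feature extractor: the function names `extract_header_features` and `build_minimal_pe`, the choice of fields, and the synthetic sample are assumptions made for illustration. The field layout follows the published PE format (the `e_lfanew` offset at 0x3C, the `PE\0\0` signature, then the 20-byte COFF header).

```python
import struct


def extract_header_features(data: bytes) -> dict:
    """Return a few COFF file header fields from raw PE bytes (illustrative subset)."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ signature")
    # e_lfanew: 4-byte little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # The signature (4 bytes) plus the COFF header (20 bytes) must fit in the buffer.
    if len(data) < pe_offset + 24 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    (machine, n_sections, timestamp, _symtab, _n_symbols,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "Machine": machine,
        "NumberOfSections": n_sections,
        "TimeDateStamp": timestamp,
        "SizeOfOptionalHeader": opt_header_size,
        "Characteristics": characteristics,
    }


def build_minimal_pe() -> bytes:
    """Build a synthetic header-only buffer for demonstration (hypothetical helper)."""
    dos_header = bytearray(0x40)
    dos_header[:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x40)  # PE signature follows directly
    coff_header = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0xF0, 0x0022)
    return bytes(dos_header) + b"PE\0\0" + coff_header


sample = build_minimal_pe()
features = extract_header_features(sample)
print(features)
```

In practice a dedicated PE parsing library would also cover the optional header, section table, and import table; the resulting values would then form one row of the feature matrix passed to the classifiers.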