Improving IDS Accuracy via XGBoost-Guided Tabular GANs: A Study on Synthetic Attack Data Generation
Abstract
Intrusion detection systems (IDSs) are crucial components for safeguarding information systems against cyberattacks. Recently, machine learning (ML) has significantly enhanced the effectiveness of IDSs by learning patterns in network traffic from training data and applying this knowledge to make predictions on new incoming data. However, training an effective ML model for an IDS requires large datasets of both normal and attack samples. While normal samples can be gathered from the daily operation of network systems, attack samples are much rarer and harder to collect. To mitigate this problem, numerous methods have been proposed for generating synthetic attacks. Previous studies typically train a generative adversarial network (GAN) to generate synthetic attacks without considering the importance and relevance of the features in the dataset. In this paper, we propose a novel method for generating synthetic attacks that trains two advanced GAN models while weighting the features in the dataset by their importance. The proposed methods are extensively evaluated on two IDS datasets. The experimental results demonstrate the effectiveness of the proposed methods in detecting both known and unknown attacks.