Detecting Arabic SMS Scam Messages Using a Hybrid Ensemble of Machine Learning Algorithms
Abstract
Short Messaging Service (SMS) remains a critical communication tool, with over five trillion messages sent globally each year. Despite the rise of internet-based messaging services, SMS retains its importance due to its high open rate, with 97% of messages read within fifteen minutes [1]. However, this ubiquity has also made SMS a prime target for scammers. SMS scams, commonly known as "smishing," pose significant threats, leading to widespread financial losses and privacy breaches for millions of users annually. This study proposes a hybrid ensemble learning model for Arabic SMS scam detection, integrating stacking and voting techniques to leverage multiple classifiers. A comprehensive dataset of scam and non-scam Arabic SMS messages was collected and preprocessed to ensure high-quality training data. The selected base models—Logistic Regression, Random Forest, and Gradient Boosting—were trained independently, and their outputs were combined through a meta-learner for final predictions. Experimental results show that the proposed model achieves 91.89% accuracy and a 91.55% F1-score, outperforming traditional classifiers and standalone ensemble models. This approach enhances detection accuracy and provides a more reliable solution for identifying scam messages in Arabic SMS communication.
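The abstract does not say how stacking and voting are integrated, which text features are used, or which hyperparameters were chosen. The following is therefore only a minimal sketch of one plausible arrangement, assuming scikit-learn: the three named base models feed a stacking classifier whose meta-learner is assumed to be logistic regression, and a soft-voting layer then averages the probabilities of the base models and the stack. The TF-IDF character n-gram features, the function name `build_hybrid_ensemble`, and the toy messages are illustrative assumptions, not the authors' pipeline or data.

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_hybrid_ensemble(cv=5, seed=0):
    """Hypothetical helper: TF-IDF features -> soft vote over base models + a stack."""

    def base_models():
        # The three base learners named in the abstract; hyperparameters are assumed.
        return [
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
            ("gb", GradientBoostingClassifier(random_state=seed)),
        ]

    # Stacking: out-of-fold base-model probabilities train a meta-learner.
    # Logistic regression as the meta-learner is an assumption.
    stack = StackingClassifier(
        estimators=base_models(),
        final_estimator=LogisticRegression(max_iter=1000),
        cv=cv,
    )

    # Voting: average predicted probabilities of the base models and the stack.
    # Combining them this way is an assumption about the "hybrid" design.
    hybrid = VotingClassifier(
        estimators=base_models() + [("stack", stack)],
        voting="soft",
    )

    # Character n-grams are an assumed feature choice for Arabic text.
    features = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    return make_pipeline(features, hybrid)


# Invented toy messages for illustration only (1 = scam, 0 = not scam).
toy_messages = [
    "مبروك! لقد ربحت جائزة مالية، أرسل رقم حسابك الآن",
    "تم إيقاف بطاقتك البنكية، اضغط على الرابط لتحديث بياناتك",
    "عرض خاص: اربح هاتفاً مجاناً عند إرسال رمز التحقق",
    "حسابك البنكي معلق، أرسل الرقم السري لإعادة التفعيل",
    "سأصل إلى البيت بعد ساعة",
    "موعد الاجتماع غداً الساعة العاشرة صباحاً",
    "هل تريد أن نتناول الغداء معاً اليوم؟",
    "شكراً لك، وصلتني الرسالة",
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# cv=2 only because the toy set is tiny; a real dataset would allow more folds.
model = build_hybrid_ensemble(cv=2)
model.fit(toy_messages, toy_labels)
print(model.predict(["اضغط على الرابط لاستلام جائزتك"]))
```

On a real corpus, the pipeline would be fitted on a training split and scored on held-out messages with accuracy and F1, the two metrics the abstract reports. Soft voting is used here so that the stack's calibrated probability counts as one vote alongside the base models; hard voting would be an equally valid variant.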