Deep Learning Hybrid Approach for Accurate SMS Spam Identification
Abstract
Short messaging service (SMS) is a popular application for mobile devices. People often use SMS when a voice call is not suitable. SMS is now also used for commercial purposes. Such messages can be useful, but unwanted messages, known as spam SMS, disturb mobile phone users. Spam SMS detection has therefore become an important application for mobile phone service providers. To date, machine-learning approaches based on various supervised learning methods have been used for spam SMS detection. In this paper, three deep-learning approaches are used for SMS spam detection: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and a hybrid of CNN and LSTM. All three approaches are trained in an end-to-end fashion. Because these approaches require numeric input, the text is first converted to numeric data: a pre-trained word embedding network converts each SMS text into an array of word vectors. An SMS spam dataset downloaded from the UCI Machine Learning Repository is used in the experiments to evaluate the performance of the three approaches. Four evaluation metrics are used: accuracy, precision, recall, and F-score. The experiments show that the hybrid approach produces better detection results than the CNN and LSTM approaches. In addition, the result of the hybrid method is compared with some previously published results, and the comparison shows that the hybrid method outperforms the compared methods.
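The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that spam is the positive class (label 1) and that the F-score is the balanced F1 variant; the abstract states neither. The function name `binary_metrics` and the sample labels are hypothetical and chosen only for illustration. They are not results from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: 1 = spam (positive class), 0 = legitimate message.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall (assumed variant).
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels for eight messages, not data from the paper.
    truth = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]
    print(binary_metrics(truth, predicted))
```

Reporting precision and recall alongside accuracy matters for spam data because spam messages are usually a minority of the collection, so accuracy alone can look high even when many spam messages are missed.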