Comparative Analysis of Feature Selection Methods for Twitter Sentiment Classification

Mohamed Omar

doi:10.52783/jisem.v10i21s.3331

Published: Mar 14, 2025

Keywords:

Classification; Machine learning; Prediction; Sentiment analysis; Tweets.

Mohamed Omar, Ahmad Salah, Mahmoud A. Mahdi

Abstract

In this paper, we investigate the performance of machine learning (ML) models for sentiment analysis of tweets under two featurization techniques: Bag of Words (BoW) and term frequency-inverse document frequency (TF-IDF). As social media continues to grow, it is becoming increasingly important to assess sentiment in short texts written in informal language. We analyze how BoW and TF-IDF affect the performance of different ML approaches, using accuracy, precision, recall, and F1-score as the evaluation metrics. Extensive experiments on a large Twitter dataset show that pairing TF-IDF with advanced ML models, such as Random Forest, significantly improves performance, and that the TF-IDF-based models outperform the prior benchmarks reported to date on the same dataset. The SVM model with a TF-IDF vectorizer performs best among all compared model and vectorizer combinations, reaching 98% on accuracy, precision, recall, and F1-score. Logistic Regression and Random Forest also perform very well with both BoW and TF-IDF, scoring consistently between about 92% and 96% on these metrics. Naive Bayes performs more moderately, at about 81% to 82% on all metrics. Overall, TF-IDF is the most effective vectorizer, especially when combined with SVM, and the SVM with TF-IDF model developed in this work outperforms the other methods on F1-score and accuracy.
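To make the difference between the two featurizations concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the toy tweets, the whitespace tokenizer, and the function names are assumptions introduced for illustration, and the TF-IDF variant shown is the unsmoothed textbook form tf * log(N / df).

```python
import math
from collections import Counter

# Toy tweets invented for illustration; they are not from the paper's dataset.
tweets = [
    "love this phone",
    "hate this phone",
    "love love this movie",
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def bow_vectors(docs):
    """Return (vocabulary, raw term-count vectors), one vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) with weight = count * log(N / df)."""
    vocab, counts = bow_vectors(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each vocabulary term.
    df = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]


vocab, bow = bow_vectors(tweets)
_, tfidf = tfidf_vectors(tweets)
print(vocab)
for count_vec, weight_vec in zip(bow, tfidf):
    print(count_vec, [round(w, 3) for w in weight_vec])
```

In this toy corpus the word "this" appears in every tweet, so BoW counts it like any other term while the unsmoothed TF-IDF weight for it is zero; rarer, more discriminative words such as "hate" receive the largest weights. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their values differ from this sketch.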

Issue

Vol. 10 No. 21s (2025)

Section

Articles

Journal of Information Systems Engineering and Management
