Enhancing Twitter Sentiment Classification with a Hybrid Bio-Inspired Feature Selection Approach
Abstract
Sentiment analysis on Twitter is an essential task for extracting valuable information from public opinion. However, the high-dimensional and noisy nature of text data makes high classification accuracy difficult to achieve. To address this problem, we propose a hybrid feature selection approach that combines chi-square, a filter method, with bio-inspired wrapper algorithms to improve classification performance. Specifically, we evaluate four hybrid approaches: Chi-Square + Genetic Algorithm (GA), Chi-Square + Particle Swarm Optimization (PSO), Chi-Square + Harris Hawks Optimization (HHO), and Chi-Square + Whale Optimization Algorithm (WOA). In each, the wrapper method refines the chi-square-filtered features using a machine learning evaluator (KNN), and the selected subset is then passed to a base classifier (KNN, SVM, NB, RF, LR, or MLP). Experimental results show that the hybrid approach followed by the MLP classifier achieves good results in terms of accuracy, precision, recall, and F1-score on all datasets used (Sentiment140, IMDB, and US Airline Tweets), outperforming simple approaches in both classification accuracy and the quality of the selected feature subsets. These results highlight the effectiveness of integrating statistical filtering with bio-inspired optimization to improve sentiment analysis models by reducing computational complexity and improving predictive performance.
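The two-stage idea described in the abstract (a chi-square filter followed by a wrapper search scored by a KNN evaluator) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term-presence matrix, the leave-one-out 1-nearest-neighbour evaluator, the subset-size penalty, and all GA hyperparameters (population size, mutation rate, tournament size) are assumptions chosen only to keep the example small and dependency-free.

```python
import random

def chi2_score(column, labels):
    """Chi-square statistic of one binary feature against binary labels (2x2 table)."""
    n = len(labels)
    observed = [[0, 0], [0, 0]]  # observed[feature_value][label]
    for value, label in zip(column, labels):
        observed[value][label] += 1
    score = 0.0
    for i in (0, 1):
        for j in (0, 1):
            row_total = observed[i][0] + observed[i][1]
            col_total = observed[0][j] + observed[1][j]
            expected = row_total * col_total / n
            if expected > 0:
                score += (observed[i][j] - expected) ** 2 / expected
    return score

def chi2_filter(X, y, k):
    """Filter stage: indices of the k features with the highest chi-square score."""
    n_features = len(X[0])
    scores = [chi2_score([row[f] for row in X], y) for f in range(n_features)]
    ranked = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:k])

def knn_loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier using Hamming distance."""
    if not features:
        return 0.0
    correct = 0
    for i, row in enumerate(X):
        best_j, best_d = None, None
        for j, other in enumerate(X):
            if j == i:
                continue
            d = sum(row[f] != other[f] for f in features)
            if best_d is None or d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def fitness(mask, X, y, candidates, size_penalty=0.01):
    """Evaluator accuracy minus an (assumed) small penalty per selected feature."""
    selected = [f for f, keep in zip(candidates, mask) if keep]
    return knn_loo_accuracy(X, y, selected) - size_penalty * len(selected)

def ga_wrapper(X, y, candidates, pop_size=12, generations=20, seed=0):
    """Wrapper stage: a small genetic algorithm over binary feature masks."""
    rng = random.Random(seed)
    n = len(candidates)

    def score(mask):
        return fitness(mask, X, y, candidates)

    # Seed the population with the "keep everything" mask plus random masks.
    population = [[1] * n]
    population += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = [population[0][:]]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(population, 3), key=score)  # tournament selection
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)                     # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # bit-flip mutation
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return [f for f, keep in zip(candidates, best) if keep], score(best)

# Toy data (hypothetical): columns = presence of great, love, awful, bad, the, flight
X = [
    [1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1], [1, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0], [1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0], [0, 0, 1, 0, 1, 1], [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 0], [0, 1, 1, 0, 0, 0], [0, 0, 1, 1, 1, 1],
]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = positive, 0 = negative

candidates = chi2_filter(X, y, k=4)                  # stage 1: statistical filter
selected, best_fitness = ga_wrapper(X, y, candidates)  # stage 2: GA wrapper
print("after chi-square filter:", candidates)
print("after GA wrapper:", selected, "fitness:", round(best_fitness, 3))
```

In this sketch the filter discards features that are statistically independent of the label before the more expensive wrapper search begins, which is where the reduction in computational cost would come from. Under the same assumptions, a PSO, HHO, or WOA variant would replace only `ga_wrapper` while reusing the same `fitness` function, and the final subset would then be handed to whichever base classifier is being evaluated.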