Enhancing COVID-19 Prediction Using Machine Learning: A Comparative Analysis of Feature Selection and Classification Techniques
Abstract
Introduction: Early and accurate detection of COVID-19 remains a critical challenge in medical diagnosis. Machine learning can predict disease outcomes from clinical parameters. This study presents a comparative analysis of feature selection methods and classification techniques for improving COVID-19 detection accuracy from blood biomarkers. We used an open-source dataset of 1,724 cases with 35 features. Data preprocessing included outlier handling, normalization, and transformation. To identify the relevant features, we employed three feature selection methods: the Chi-Square test, the Pearson correlation coefficient, and Random Forest. A stacking ensemble classification technique was then used to improve prediction accuracy. The machine-learning-based classification models, combined with optimized feature selection, effectively predicted COVID-19 infection from blood biomarkers.
Objectives: To improve the accuracy of COVID-19 prediction by applying machine learning feature selection and classification techniques to blood biomarkers.
Methods: The comparative analysis used a publicly available dataset of 1,724 cases with 35 attributes. Data preprocessing involved outlier handling, normalization, and transformation. Three feature selection techniques were employed: the Chi-Square test, the Pearson correlation coefficient, and Random Forest. A stacking ensemble classification algorithm was used to improve model performance.
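The workflow described in the Methods can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the study's implementation: the data are synthetic (generated to match only the stated shape of 1,724 cases and 35 attributes), and the number of retained features `K`, the min-max scaling, the base learners (Random Forest and k-nearest neighbours), and the logistic-regression meta-learner are all assumptions, since the abstract does not specify them. Outlier handling and transformation are omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the blood-biomarker table (NOT the study's data).
X, y = make_classification(n_samples=1724, n_features=35,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Min-max normalization (assumed); chi2 needs non-negative inputs.
scaler = MinMaxScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

K = 10  # hypothetical number of features to keep

# 1) Chi-Square test: keep the K features with the highest chi2 statistic.
chi2_cols = SelectKBest(chi2, k=K).fit(X_train_s, y_train).get_support(indices=True)

# 2) Pearson correlation: rank features by |r| with the class label.
pearson = np.array([abs(np.corrcoef(X_train_s[:, j], y_train)[0, 1])
                    for j in range(X_train_s.shape[1])])
pearson_cols = np.argsort(pearson)[::-1][:K]

# 3) Random Forest: rank features by impurity-based importance.
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train_s, y_train)
rf_cols = np.argsort(rf.feature_importances_)[::-1][:K]

selections = {"chi2": chi2_cols, "pearson": pearson_cols, "random_forest": rf_cols}

# Fit one stacking ensemble per feature subset and compare test accuracy.
results = {}
for name, cols in selections.items():
    stack = StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=5,
    )
    stack.fit(X_train_s[:, cols], y_train)
    results[name] = stack.score(X_test_s[:, cols], y_test)
    print(f"{name}: test accuracy = {results[name]:.3f}")
```

Fitting the scaler and all three selectors on the training split only keeps the held-out accuracy from being inflated by information leaking from the test cases.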
Results: The classification models were effective in predicting COVID-19 from blood biomarkers. Optimized feature selection significantly improved predictive accuracy, highlighting the importance of selecting relevant features for model performance.
Conclusions: This study showcases the potential of ML-driven approaches for COVID-19 detection, emphasizing the role of feature selection in improving classification accuracy. The findings contribute to the advancement of diagnostic tools, offering a data-driven solution for rapid and reliable COVID-19 screening.