Type-2 Diabetes Detection using XGBoost with ADASYN Over SVM
Abstract
Type-2 Diabetes Mellitus (T2DM) is a chronic metabolic disease affecting millions of people worldwide, and early detection is important for avoiding serious complications. Conventional clinical diagnostic techniques are invasive and time-consuming. In this research, we investigate the use of machine learning algorithms for the early identification of Type-2 Diabetes from patient data. A major challenge in such medical datasets is class imbalance, where diabetic cases are far fewer than non-diabetic cases. To address this, we use the Adaptive Synthetic (ADASYN) sampling technique to generate synthetic minority-class samples and improve classifier sensitivity. We compare the Support Vector Machine (SVM), a popular baseline algorithm, with the Extreme Gradient Boosting (XGBoost) model, known for its robustness and predictive accuracy. Model performance is measured using precision, recall, F1-score, and ROC-AUC. Our findings show that XGBoost with ADASYN significantly outperforms the standard SVM classifier, providing a more efficient method for early diabetes detection in imbalanced datasets.
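To make the evaluation metrics named in the abstract concrete, the following minimal sketch computes precision, recall, F1-score, and ROC-AUC in plain Python for a small imbalanced example. The labels, scores, 0.5 decision threshold, and function names are hypothetical illustrations and are not taken from the study; the article's own experiments presumably relied on library implementations.

```python
# Illustrative sketch only: toy data and helper names are assumptions,
# not the dataset or code used in the article.

def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, y_score):
    """ROC-AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 3 diabetic (1) and 7 non-diabetic (0) cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical classifier scores (estimated probability of diabetes).
y_score = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
```

Recall is the metric most directly tied to the classifier sensitivity that oversampling is intended to improve, since it measures the share of diabetic cases that the model actually flags. Unlike the other three, ROC-AUC does not depend on the chosen threshold.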