Classification of Arabic Geographical Research Papers Using Machine Learning Techniques: A Comparative Analysis of TF-IDF and Word2Vec
Abstract
The classification of Arabic geographical research papers is challenging because of the linguistic complexity of Arabic and the absence of standardized datasets. In this study, we build a new dataset of Arabic texts extracted from geographical research papers, comprising the research files, their abstracts, and their geographical category (human or physical geography). After preprocessing and text cleaning, TF-IDF and Word2Vec were used as feature extraction techniques. Four machine learning models were tested: Naïve Bayes, Logistic Regression, Support Vector Machine (SVM), and Random Forest. Experimental results demonstrated that SVM a
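To make the TF-IDF feature extraction step concrete, the following is a minimal sketch in plain Python. It is not the pipeline used in the study: the toy corpus, the whitespace tokenisation, the function name `tfidf_vectors`, and the smoothed IDF formula with L2 normalisation are all assumptions chosen for illustration, since the abstract does not specify the preprocessing or the weighting variant.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised documents.

    Assumed weighting (not taken from the paper):
    idf(t) = ln((1 + n_docs) / (1 + df(t))) + 1, followed by L2 normalisation.
    """
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: number of documents in which each term occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}

    vectors = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        vec = [tf[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


# Hypothetical toy corpus; a real run would use the cleaned dataset texts.
docs = [
    # human geography: population, cities, migration
    "السكان المدن الهجرة السكان".split(),
    # physical geography: climate, terrain, rivers
    "المناخ التضاريس المناخ الانهار".split(),
    # mixed: population, climate
    "السكان المناخ".split(),
]

vocab, vectors = tfidf_vectors(docs)
```

Each document becomes a vector with one weight per vocabulary term, and vectors of this kind are what a classifier such as Naïve Bayes, Logistic Regression, SVM, or Random Forest would take as input. Terms that occur in fewer documents receive a larger IDF and therefore a larger weight.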