Semantic Analysis of Kannada Summaries Using Machine Learning
Abstract
The growing volume of digital content in regional languages, especially Kannada, has created significant demand for sentiment analysis tools tailored to these languages. This research aims to develop an efficient machine learning model that classifies Kannada text into emotion categories such as happiness, sadness, anger, and fear. The project uses a RandomForestClassifier trained on a carefully curated dataset of Kannada sentences, preprocessed with tokenization, stemming, and stopword removal. Because many users are bilingual, the system also integrates Google Translate for English-to-Kannada translation, so that input in either language can be analyzed. A web-based interface built with Flask provides real-time sentiment predictions and displays the results together with the model's accuracy. The system achieved promising results, categorizing Kannada sentiments with high accuracy. The framework lays the groundwork for extending sentiment analysis to other regional Indian languages. Future directions include expanding the dataset, incorporating deep learning techniques such as LSTMs or BERT, and implementing real-time sentiment analysis for broader scalability.
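The classification step described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the eight training sentences, the stopword list, the TF-IDF features, and the names `tokenize`, `TRAIN`, and `predict_emotion` are all hypothetical, and stemming, translation, and the Flask interface are left out.

```python
import re

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical stopword list for illustration; a real list would be far longer.
STOPWORDS = {"ನನಗೆ", "ತುಂಬಾ", "ಇಂದು", "ಮತ್ತು"}

# Match runs of characters from the Kannada Unicode block (U+0C80 to U+0CFF).
# A generic \w-based pattern can split Kannada words at vowel signs, so the
# block range is used instead.
KANNADA_WORD = re.compile(r"[\u0C80-\u0CFF]+")


def tokenize(text):
    """Split text into Kannada word tokens and drop stopwords."""
    return [tok for tok in KANNADA_WORD.findall(text) if tok not in STOPWORDS]


# Toy training data (assumption): two invented sentences per emotion.
# This is not the curated dataset described in the abstract.
TRAIN = [
    ("ನನಗೆ ತುಂಬಾ ಸಂತೋಷವಾಗಿದೆ", "happiness"),
    ("ಇಂದು ನನಗೆ ಖುಷಿಯಾಗಿದೆ", "happiness"),
    ("ನನಗೆ ತುಂಬಾ ದುಃಖವಾಗಿದೆ", "sadness"),
    ("ಇಂದು ನನಗೆ ಬೇಸರವಾಗಿದೆ", "sadness"),
    ("ನನಗೆ ತುಂಬಾ ಕೋಪ ಬಂದಿದೆ", "anger"),
    ("ಅವನ ಮೇಲೆ ನನಗೆ ಸಿಟ್ಟು ಬಂದಿದೆ", "anger"),
    ("ನನಗೆ ತುಂಬಾ ಭಯವಾಗುತ್ತಿದೆ", "fear"),
    ("ಕತ್ತಲೆಯಲ್ಲಿ ನನಗೆ ಹೆದರಿಕೆ ಆಗುತ್ತಿದೆ", "fear"),
]
EMOTIONS = sorted({label for _, label in TRAIN})

# TF-IDF features feeding a random forest; the feature choice is an assumption,
# since the abstract names only the classifier.
model = make_pipeline(
    TfidfVectorizer(tokenizer=tokenize, token_pattern=None, lowercase=False),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit([sentence for sentence, _ in TRAIN], [label for _, label in TRAIN])


def predict_emotion(sentence):
    """Return the predicted emotion label for one Kannada sentence."""
    return str(model.predict([sentence])[0])


if __name__ == "__main__":
    print(predict_emotion("ನನಗೆ ತುಂಬಾ ಭಯವಾಗುತ್ತಿದೆ"))
```

With so few examples the predictions carry no statistical weight. The sketch only shows how the preprocessing and the classifier fit together, and a held-out test split would be needed to report an accuracy figure.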