Normalizing Sanskrit Texts: An Approach Towards Enhanced Accessibility and Precision

Sabnam Kumari, Amita Malik

Abstract

Sanskrit text normalisation resolves inconsistencies in spelling, morphology, and syntax to improve computational text processing and the online availability of ancient texts. This research develops a normalisation pipeline to increase the accuracy of NLP applications and Text-to-Speech (TTS) systems. In our approach, reducing non-standard words (NSW) improves searchability and comprehension. The project contributes to the digital humanities by making Sanskrit texts more readily available for linguistic research and historical studies. The results indicate that careful standardisation of the text considerably increases the efficiency and accuracy of computational text processing. By applying basic concepts and methods, the study improves the capacity to search, analyse, and comprehend ancient Sanskrit works digitally. The study also shows that eliminating NSW is a necessary step in ensuring that the input text follows a standard language form, which in turn improves performance in NLP tasks and speech synthesis. The model achieves an accuracy of 93%, precision of 92%, recall of 91%, an F1 score of 91%, and specificity of 94%.
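
For readers unfamiliar with the reported measures, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and specificity are conventionally derived from confusion-matrix counts. It is not taken from the study: the function names are hypothetical, and the abstract does not state the counts or the evaluation setup behind its figures. The only values reused from the abstract are the reported precision (92%) and recall (91%), which are checked against the reported F1 score.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. Hypothetical helper,
    not the authors' evaluation code.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "specificity": tn / (tn + fp),
    }


def f1_from(precision: float, recall: float) -> float:
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check using only the figures reported in the abstract.
reported_f1 = f1_from(0.92, 0.91)
print(round(reported_f1, 2))
```

Under these standard definitions, a precision of 92% and a recall of 91% give an F1 score of about 0.915, which is consistent with the reported 91%.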
