Real-Time Multilingual Speech Recognition and Language Mapping for Indian Code-Switched Speech


Mayur M. Jani, Sandip R. Panchal, Hemant H. Patel

Abstract

Introduction: In India, speakers routinely switch between languages within a single conversation. Such mixed-language speech, known as code-switching, is difficult for standard speech recognition systems to handle. The difficulty increases when the input includes low-resource Indian languages or when words appear in Roman script rather than in their native script.


Objectives: This work introduces a real-time speech recognition system that handles multilingual input (specifically Hindi, Gujarati, and English) without requiring users to select a language beforehand. The aim is to simplify the transcription of mixed-language speech while rendering each word in its correct script.


Methods: The system listens continuously and identifies the language of each word on the fly. It uses bilingual dictionaries to convert Romanized and code-switched words into their native-script forms. The interface is a simple Notepad-style editor built with Python and Tkinter, and transcription relies on the Google Speech API. Users can transcribe speech and also save or share the output.
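
The dictionary-based mapping step described in the Methods might look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the dictionary entries, the lookup order (Hindi before Gujarati), and the names `map_word` and `map_transcript` are all hypothetical, and punctuation handling is omitted.

```python
# Toy Romanized-to-native-script tables. These entries are illustrative
# assumptions, not the bilingual dictionaries used in the paper.
HINDI = {"mera": "मेरा", "naam": "नाम", "hai": "है"}
GUJARATI = {"maru": "મારું", "kem": "કેમ", "che": "છે"}

# Lookup order is an assumption: Hindi is checked first, then Gujarati.
DICTIONARIES = [("hi", HINDI), ("gu", GUJARATI)]


def map_word(word):
    """Return (language_tag, output_form) for one transcribed word."""
    key = word.lower()
    for lang, table in DICTIONARIES:
        if key in table:
            return lang, table[key]
    # Words found in no dictionary are treated as English and left unchanged.
    return "en", word


def map_transcript(text):
    """Convert a whitespace-separated transcript word by word."""
    return " ".join(map_word(w)[1] for w in text.split())


if __name__ == "__main__":
    print(map_transcript("mera naam Mayur hai"))
    print(map_transcript("kem che everyone"))
```

A plain lookup like this cannot resolve a Romanized word that is valid in more than one language; a real system would need context or a language-identification model to break such ties.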


Results: Tests show that the tool performs well across a range of sentence types, even when the structure is complex or languages change mid-sentence. It achieves high accuracy in both speech recognition and script conversion, with minimal delay.


Conclusions: By combining real-time processing, automatic transliteration, and an easy-to-use interface, this system fills a gap left by current automatic speech recognition (ASR) solutions. It offers a practical way for people in multilingual communities to document, communicate, and share spoken content more effectively.
