On-device Keyword Spotting of Odia Language on an Edge Device Using Quantization for Model Compression

Bikash Ranjan Bag, Manas Ranjan Patra

Abstract

Keyword spotting (KWS) is the speech recognition task of detecting specific words or phrases in speech and reacting to them as commands. KWS can be applied in many domains, including voice assistants, security systems, automotive systems, healthcare, industrial systems, and edge devices in general. In this research, a KWS system has been developed to recognize 24 frequently used words in the Odia language, which include the digits (0 to 9), seven colours, and different directions. The dataset was recorded with a Zoom H1n microphone at a frequency of 41 kHz. This paper proposes three models for KWS, based on CNN, LSTM, and CNN+LSTM architectures. Mel Frequency Cepstral Coefficients (MFCCs) are used as features for each keyword. The models are trained and tested on the dataset we prepared. The CNN model performs better than the other two models. The models are compressed using quantization, which reduces model size by 3x. The accuracy of the original model (97%) is preserved after quantization, which enables such a model to be deployed on various edge devices. All the models are deployed and tested on a Raspberry Pi 3B board.
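
The abstract does not state which quantization scheme was used, so the following is only a minimal sketch of one common option: 8-bit affine quantization of a layer's weights, written in plain Python. The function names (quantize_int8, dequantize, size_in_bytes) and the random stand-in weights are hypothetical and are not taken from the paper. The sketch shows why storing weights as 8-bit integers instead of 32-bit floats shrinks them, and how the original values can be approximately recovered.

```python
import random
import struct


def quantize_int8(weights):
    """Map a list of floats to int8 values with an affine (scale, zero-point) scheme."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # 255 steps separate the 256 int8 levels; fall back to 1.0 for all-zero input.
    scale = (w_max - w_min) / 255.0 or 1.0
    zero_point = round(-128 - w_min / scale)
    # Round each weight to the nearest level and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float weights from their int8 representation."""
    return [(v - zero_point) * scale for v in q]


def size_in_bytes(values, fmt):
    """Size of the values when packed as 'f' (float32) or 'b' (int8)."""
    return len(struct.pack(f"{len(values)}{fmt}", *values))


# Hypothetical stand-in for one layer's trained weights (assumption, not paper data).
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("float32 bytes:", size_in_bytes(weights, "f"))
print("int8 bytes:", size_in_bytes(q, "b"))
print("max reconstruction error:", max_error)
```

In this sketch the weights alone shrink by the ratio of 4 bytes to 1 byte per value. The 3x figure reported above refers to the size of the whole model, and this example does not attempt to reproduce it. The small reconstruction error, bounded by roughly one quantization step, is consistent with the observation that accuracy can survive quantization, although that has to be verified on the actual model and test set.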
