A Novel Framework for Extracting and Recognizing Text in Scene Images through Self-Attention CNN Enhanced with Fuzzy DNN

Senu Jerome, Anuj Mohamed

Abstract

Detecting and extracting textual information from images aids in better understanding the content they contain. The extracted text can serve as input to many computer vision-based applications. Text retrieval from natural scene images is challenging because of noise interference, poor lighting conditions, and occlusion by other objects in the scene. A method for extracting text from images is proposed, combining a Self-Attention Convolutional Neural Network (SAT-CNN) with a Fuzzy Deep Neural Network (DNN). This approach aims to achieve highly accurate text extraction by minimizing error rates. The input images are preprocessed with a trilateral filter to eliminate noise and enhance image quality. After effective feature extraction, the foreground must be accurately identified and separated from the scene image to isolate the text. The extracted features are then fed into the Self-Attention-based Convolutional Neural Network to distinguish between text and non-text components. During the text classification process, the misclassification rate is minimized with the help of the metaheuristic Human Mental Search Algorithm (HMSA). Once the text and non-text components have been identified, characters are recognized from the text using a Fuzzy Deep Neural Network with a Sparse Autoencoder (FDNN-SAE). The proposed framework attains higher accuracy than existing methods such as WNBC–AGSO-TE, HDNN–AGSO-TE, and GAN-TE.

Article Details

Section: Articles