Hybrid CNN-BiLSTM with CTC for Enhanced Text Recognition in Complex Background Images

Rakesh T M, Girisha G S

Abstract

The challenges that automated text recognition faces, such as poor lighting, cluttered backgrounds, and blur, resemble those encountered in human vision. Addressing them enables applications such as document digitization and assistive technology. This study introduces a text-recognition method that combines CNNs, BiLSTMs, and a CTC decoder. The CNN component extracts spatial features of text even from cluttered images, while the BiLSTM captures sequential context, allowing text in varied styles, orientations, and sizes to be recognized. Because the CTC decoder requires no separate character segmentation, predictions are aligned with the text accurately. On the ICDAR 2015 and SVT datasets, the proposed approach achieves high accuracies of 98.50% and 98.80%, respectively. Robustness evaluations show that the model remains accurate on images with motion blur (up to 15 pixels), partial occlusion (up to 40%), and distortion (half of the text skewed by up to 30 degrees).
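The abstract's key architectural point is that the CTC decoder removes the need for per-character segmentation: the BiLSTM emits a label (or a blank) at every timestep, and decoding collapses repeats and drops blanks. A minimal sketch of this best-path (greedy) CTC decoding step is shown below; the character set, blank index, and function name are illustrative assumptions, not the authors' implementation.

```python
BLANK = 0  # assumed convention: CTC blank token at index 0

def ctc_greedy_decode(timestep_labels, charset):
    """Best-path CTC decoding: collapse repeated labels, then drop blanks.

    `timestep_labels` is the per-frame argmax over the BiLSTM's output
    distribution; no character segmentation of the input image is needed.
    """
    decoded = []
    prev = None
    for label in timestep_labels:
        # Emit a character only when the label changes and is not blank.
        if label != prev and label != BLANK:
            decoded.append(charset[label - 1])  # shift past blank at index 0
        prev = label
    return "".join(decoded)

# Example: hypothetical per-frame predictions for the word "cat"
charset = "abcdefghijklmnopqrstuvwxyz"
frames = [3, 3, 0, 1, 1, 0, 0, 20, 20]  # c c <blank> a a <blank> <blank> t t
print(ctc_greedy_decode(frames, charset))  # prints "cat"
```

In the full pipeline described in the abstract, these per-frame labels would come from the CNN-BiLSTM feature sequence; during training, the same alignment-free property is exploited by the CTC loss.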
