Commonsense-based Visual-Linguistic Reasoning for Document Filtering using Multimodal Large Language Models
Abstract
In many real-world scenarios, users need to sift through large collections of image-based documents to find those containing personal or contextually important information, such as names, email addresses, or phone numbers. Manual filtering is inefficient and error-prone, especially when the visual data is unstructured. To address this challenge, we propose an intelligent, automated filtering pipeline that combines techniques from natural language processing (NLP), computer vision, and commonsense reasoning. Our system uses optical character recognition (OCR) to extract textual content from images, followed by textual entailment models and pattern recognition to assess the relevance of the extracted entities in context. A key innovation of our approach is Commonsense-based Visual-Linguistic Reasoning (CVLR), a framework that incorporates knowledge graphs and multimodal large language models (LLMs) to strengthen the system's ability to infer the context and intent behind visual information. We fine-tune state-of-the-art multimodal LLMs on a custom dataset of more than 2,000 image documents, enabling accurate classification of document types (e.g., invoices, ID cards, certificates) and intelligent filtering based on user-defined relevance criteria. The result is a robust solution that identifies the documents that matter to the user, even when explicit identifiers are partially obscured or only contextually implied.
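To make the staged pipeline described above concrete, the following is a minimal Python sketch of its first two stages (OCR extraction and pattern-based identifier detection) together with a hook for the LLM-based relevance judgment. It assumes the pytesseract and Pillow packages for OCR; the function names, regular expressions, and the classify_with_llm placeholder are illustrative assumptions, not the interface of the system reported in the paper.

import re
from PIL import Image
import pytesseract

# Patterns for the explicit identifiers named in the abstract (email addresses
# and phone numbers); names and implied context are left to the LLM stage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"(?:\+?\d{1,3}[\s.-]?)?(?:\(?\d{2,4}\)?[\s.-]?)?\d{3,4}[\s.-]?\d{4}")

def extract_text(image_path: str) -> str:
    """Stage 1: OCR -- pull raw text out of an image-based document."""
    return pytesseract.image_to_string(Image.open(image_path))

def find_identifiers(text: str) -> dict:
    """Stage 2: pattern recognition -- surface explicit personal identifiers."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }

def is_relevant(image_path: str, classify_with_llm) -> bool:
    """Keep a document if it carries explicit identifiers, or if the caller's
    multimodal model judges it relevant from image and text context alone."""
    text = extract_text(image_path)
    found = find_identifiers(text)
    if found["emails"] or found["phones"]:
        return True
    # classify_with_llm stands in for the fine-tuned multimodal LLM; its exact
    # inputs and outputs are not specified in the abstract.
    return classify_with_llm(image_path, text)

In this sketch the cheap regex pass handles documents whose identifiers appear verbatim, and only the remaining documents are routed to the multimodal model, which handles the partially obscured or contextually implied cases the abstract highlights.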