Integrating Deep Learning Techniques in Information Retrieval: A Hybrid Approach to Relevance Optimization
Abstract
Traditional keyword-based information retrieval (IR) systems, while effective for exact term matching, often fail to capture semantic meaning, leading to suboptimal relevance, especially for complex queries. Studies show that conventional models typically achieve 85–90% accuracy, whereas deep learning methods such as BERT and DeepCT have reached up to 98.6% accuracy in text retrieval tasks. However, many current implementations do not fully exploit the complementary strengths of neural and lexical techniques. This research addresses that gap by proposing a hybrid IR framework that integrates BM25 with neural embeddings, using transformer models and contextual weighting. Using the MS-MARCO and TREC-CAR datasets, the methodology includes training neural ranking models, implementing Learning to Rank (LTR) and pseudo-relevance feedback (PRF), and evaluating performance via metrics such as mean average precision (MAP), nDCG, and MRR. The hybrid system outperformed traditional models, with a 25–30% improvement in recall and a 12% gain in MAP; user satisfaction scores were also 15–20% higher, particularly for ambiguous or domain-specific queries. These findings suggest that combining lexical and semantic signals significantly enhances retrieval relevance and user experience. The model's applicability spans enterprise, academic, and web search contexts, with systems such as Vertex AI and Elasticsearch already demonstrating similar performance gains. Future work will explore reducing model complexity for real-time scalability, enhancing interpretability, and developing adaptive algorithms that incorporate continuous user feedback for iterative optimization.
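The core idea of the framework, fusing a lexical BM25 score with a semantic embedding similarity, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the corpus, the two-dimensional stand-in vectors, and the fusion weight `alpha` are hypothetical, and in the described system the embeddings would come from a transformer encoder (e.g., BERT) over MS-MARCO or TREC-CAR passages.

```python
import math

# Hypothetical toy corpus; the paper uses MS-MARCO and TREC-CAR passages.
corpus = [
    "deep learning improves search relevance",
    "bm25 ranks documents by term frequency",
    "neural embeddings capture semantic meaning",
]
docs = [d.split() for d in corpus]
N = len(docs)
avgdl = sum(len(d) for d in docs) / N

def bm25_score(query_tokens, doc, k1=1.5, b=0.75):
    """Okapi BM25 lexical score for one query/document pair."""
    score = 0.0
    for term in query_tokens:
        df = sum(term in d for d in docs)          # document frequency
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        tf = doc.count(term)                       # term frequency
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Stand-in document embeddings; a transformer model would produce these.
doc_vecs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.6]]

def hybrid_score(query_tokens, query_vec, alpha=0.5):
    """Weighted fusion of max-normalised BM25 and embedding similarity."""
    lex = [bm25_score(query_tokens, d) for d in docs]
    mx = max(lex) or 1.0                           # avoid division by zero
    sem = [cosine(query_vec, v) for v in doc_vecs]
    return [alpha * (l / mx) + (1 - alpha) * s for l, s in zip(lex, sem)]

scores = hybrid_score(["semantic", "meaning"], [0.8, 0.3])
best = max(range(len(corpus)), key=scores.__getitem__)
```

Here the lexical and semantic signals are combined by a simple weighted sum; in a full system, the fusion weight could instead be learned by the LTR stage mentioned in the abstract.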