A Novel Hybrid Method for Pronominal Anaphora Resolution in Hindi Text
Abstract
Effective Anaphora Resolution (AR) is essential for computational linguistics, as it underpins coherent text analysis, information processing pipelines, and the development of advanced language technologies. Pronominal Anaphora Resolution plays a crucial role in analyzing and understanding large text collections, enabling discourse understanding systems and enhancing the performance of related applications such as text summarization, sentiment analysis, and machine translation. This paper proposes and evaluates a novel hybrid method for Hindi AR. The proposed method uses a rule-based approach to resolve reflexive and locative pronouns and supervised classifiers to resolve demonstrative and relative pronouns. We investigate two machine learning classifiers and one deep learning classifier: the Distributed Random Forest classifier, a stacked ensemble classifier, and a Multi-Layer Perceptron. Performance is evaluated on two standard datasets: the Hindi Tourism dataset and the Hindi Dependency Treebank Data (HDTB). The stacked ensemble model outperforms all other models investigated in this paper on the Hindi Tourism dataset, with an accuracy of 76.33%. The deep learning model performs better than the stacked ensemble and random forest models on the HDTB dataset, with an overall accuracy of 75.96%. The proposed hybrid method outperforms most of the earlier reported work on Hindi AR. This research demonstrates the potential of deep learning (DL) and machine learning (ML) classifiers in developing an automatic entity linking system for Hindi text, which is necessary for the correct semantic interpretation of text.
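
The routing idea behind the hybrid method can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the placeholder rule (pick the nearest preceding candidate), and the `classifier_score` callable are all assumptions made for illustration. Only the split of pronoun types between the rule-based and classifier-based paths comes from the abstract.

```python
from typing import Callable, List

# Pronoun categories as assigned in the abstract.
RULE_BASED_TYPES = {"reflexive", "locative"}
CLASSIFIER_TYPES = {"demonstrative", "relative"}


def resolve_with_rules(pronoun: str, candidates: List[str]) -> str:
    """Hypothetical placeholder rule: choose the nearest preceding candidate.

    The actual rules used in the paper are not given in the abstract.
    """
    return candidates[-1]


def resolve_pronoun(
    pronoun: str,
    pronoun_type: str,
    candidates: List[str],
    classifier_score: Callable[[str, str], float],
) -> str:
    """Route a pronoun to the rule-based or classifier-based resolver.

    `candidates` lists possible antecedents in text order.
    `classifier_score` stands in for any trained model (assumed interface)
    that scores a (pronoun, candidate) pair; higher means more likely.
    """
    if not candidates:
        raise ValueError("no candidate antecedents supplied")
    if pronoun_type in RULE_BASED_TYPES:
        return resolve_with_rules(pronoun, candidates)
    if pronoun_type in CLASSIFIER_TYPES:
        return max(candidates, key=lambda c: classifier_score(pronoun, c))
    raise ValueError(f"unsupported pronoun type: {pronoun_type}")
```

In this sketch any supervised model can be plugged in through `classifier_score`, so the three classifiers compared in the paper could be swapped without changing the routing logic.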