Detecting Adverse Drug Reactions from Twitter Data Using Natural Language Processing and Deep Learning
Abstract
Adverse Drug Reactions (ADRs) pose major challenges to patient safety and require timely, precise identification to strengthen pharmacovigilance initiatives. Traditional ADR reporting systems suffer from underreporting and delays, prompting the need for alternative data sources such as social media. However, extracting meaningful insights from unstructured and noisy social media text is difficult. This research proposes a novel Deep Convolutional Recurrent Semantic Similarity Model (DCR-SSM), which integrates convolutional and recurrent layers with a semantic similarity mechanism and an attention module to enhance ADR detection from Twitter data. The framework incorporates a robust preprocessing pipeline tailored to social media text, along with Decision Tree-based feature selection and Bag-of-Words encoding to capture relevant linguistic and semantic features. Comprehensive experiments on the SMM4H dataset illustrate the superiority of the proposed model over leading ADR detection techniques. DCR-SSM achieved an accuracy of 72%, a precision of 75%, a recall of 72%, and an F1-score of 73%, outperforming a traditional machine learning model (SVM) and deep learning models (LSTM, Bi-LSTM, CNN). Compared with the best-performing existing models, the proposed framework improves precision by up to 5.2% and maintains a balanced trade-off between recall and F1-score, supporting better generalization in real-world applications. The findings highlight the potential of NLP and deep learning for mining patient-reported ADRs from social media, offering a scalable and cost-effective alternative to conventional pharmacovigilance methods. Future research can explore multilingual ADR detection and domain-specific embeddings to further enhance detection accuracy and adaptability across diverse healthcare settings.
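The front end of the pipeline described above (tweet preprocessing, Bag-of-Words encoding, and tree-based feature selection) can be illustrated with a minimal sketch. The abstract does not specify the authors' implementation, so everything here is an assumption: the toy tweets and labels are invented, the function names are hypothetical, and the selection step ranks tokens by the Gini impurity decrease of a single presence/absence split, a simple stand-in for Decision Tree-based selection rather than the method used in the study.

```python
import re
from collections import Counter

# Hypothetical toy tweets with labels (1 = mentions an ADR, 0 = does not).
TWEETS = [
    ("@user this med gave me a terrible headache http://t.co/x", 1),
    ("Headache and nausea after the new pills #sideeffects", 1),
    ("Picked up my prescription today, feeling fine", 0),
    ("New pills arrived today, feeling good", 0),
]


def preprocess(text):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.findall(r"[a-z]+", text)


def bag_of_words(token_lists):
    """Return (sorted vocabulary, per-document token count rows)."""
    vocab = sorted({tok for toks in token_lists for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for toks in token_lists:
        row = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        rows.append(row)
    return vocab, rows


def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def stump_gain(column, labels):
    """Impurity decrease from one split on token presence (count > 0)."""
    present = [y for x, y in zip(column, labels) if x > 0]
    absent = [y for x, y in zip(column, labels) if x == 0]
    weighted = (len(present) * gini(present)
                + len(absent) * gini(absent)) / len(labels)
    return gini(labels) - weighted


def select_features(vocab, rows, labels, k):
    """Keep the k tokens with the largest gain (ties broken alphabetically)."""
    gains = [(stump_gain([row[j] for row in rows], labels), tok)
             for j, tok in enumerate(vocab)]
    gains.sort(key=lambda g: (-g[0], g[1]))
    return [tok for _, tok in gains[:k]]


tokens = [preprocess(text) for text, _ in TWEETS]
labels = [label for _, label in TWEETS]
vocab, rows = bag_of_words(tokens)
top = select_features(vocab, rows, labels, k=3)
print(top)
```

In a fuller system, the selected columns of the count matrix would be passed on to the downstream classifier; the convolutional, recurrent, similarity, and attention components of DCR-SSM are outside the scope of this sketch.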