Explainable Multimodal Deep Learning Framework to predict cardiovascular disease with heterogeneous clinical data
Abstract
Cardiovascular disease (CVD) is the leading cause of death worldwide and places a heavy burden on healthcare systems. Classical risk assessment models such as Framingham and SCORE have several limitations: they consider only a few clinical variables and cannot represent the complexity and heterogeneity of modern healthcare data, such as electronic health records (EHR), laboratory values, electrocardiograms (ECG), medical images, and lifestyle determinants. This work proposes an explainable multimodal deep learning framework that integrates these diverse clinical data sources to achieve accurate and interpretable CVD prediction. The proposed model uses modality-specific encoders to derive high-level representations of both structured and unstructured data, followed by a fusion mechanism that captures interactions across modalities. The framework is trained and evaluated on a multi-institutional dataset of more than 10,000 patients comprising EHR, laboratory results, ECG, and imaging features. To improve transparency, the explainability methods SHAP (SHapley Additive exPlanations) and Grad-CAM are integrated to identify the clinically significant features that drive predictions. In experiments, the proposed approach outperforms traditional machine learning and unimodal deep learning models, achieving an AUC of 0.93, a sensitivity of 0.90, and a specificity of 0.88. The results indicate that the framework not only improves predictive accuracy but also offers valuable insight into the model's decisions, making it suitable for use in clinical decision support.
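For readers less familiar with the three reported metrics, the following is a minimal sketch of how AUC, sensitivity, and specificity can be computed from predicted risk scores. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the sketch assumes both classes are present in the data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a decision threshold (0.5 is an assumed default)."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1  # CVD case correctly flagged
        elif label == 1:
            fn += 1  # CVD case missed
        elif predicted == 0:
            tn += 1  # non-case correctly cleared
        else:
            fp += 1  # non-case wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = CVD event, 0 = no event; scores are model risk estimates.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.92, 0.81, 0.40, 0.35, 0.10, 0.62, 0.20, 0.77]

sens, spec = sensitivity_specificity(labels, scores)
auc = roc_auc(labels, scores)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values are usually reported together.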