An Explainable Sparse Autoencoder–CNN Framework for Robust Cardiovascular Disease Prediction Using Enhanced Feature Representations
by J. Senthilkumar, Tirupatirao Kalipindi, V. Mohanraj, Y. Suresh
Published: January 23, 2026 • DOI: 10.51244/IJRSI.2026.13010010
Abstract
Since cardiovascular diseases (CVDs) are the world's leading cause of death, it is critical to develop prediction frameworks that are reliable, accurate, and interpretable in order to support prompt clinical decision-making. Traditional machine learning techniques for cardiac risk assessment have been studied extensively, but their efficacy is frequently constrained by their dependence on hand-crafted features and their limited capacity to capture intricate non-linear relationships in clinical data. Deep learning techniques offer stronger representation learning, yet overfitting and poor interpretability limit their efficacy on structured, low-dimensional clinical data.
This paper proposes a novel deep learning framework that combines sparse autoencoder-based feature augmentation with CNN classification for robust heart disease prediction. By imposing sparsity constraints, the sparse autoencoder generates enriched latent representations that help reveal hidden, clinically relevant patterns in tabular patient records. The augmented representation is then reshaped into a structured sequence and passed through a CNN to capture higher-order feature interactions. Furthermore, a multitask learning strategy trains the model to reconstruct its input and classify disease simultaneously, improving its generalization capability and predictive stability.
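The feature-augmentation step and the multitask objective described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the layer sizes, the form of the sparsity penalty, the way latent features are combined with the inputs, or the loss weights. Here the penalty is assumed to be a KL-divergence term, augmentation is assumed to be concatenation, the weights `lam` and `beta` and the target activation `rho` are hypothetical, the data are random, and a single linear layer stands in for the CNN classifier.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_sparsity(rho, rho_hat, eps=1e-8):
    """KL divergence between a target activation rho and mean activations rho_hat."""
    rho_hat = np.clip(rho_hat, eps, 1.0 - eps)
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))))

def multitask_loss(x, x_rec, hidden, y, p, rho=0.05, beta=1e-3, lam=0.5, eps=1e-8):
    """Classification loss + lam * reconstruction loss + beta * sparsity penalty.

    rho, beta and lam are assumed values, not taken from the paper.
    """
    rec = np.mean((x - x_rec) ** 2)                  # reconstruction (MSE)
    sparse = kl_sparsity(rho, hidden.mean(axis=0))   # per-unit mean activation
    p = np.clip(p, eps, 1.0 - eps)
    bce = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))  # binary cross-entropy
    return float(bce + lam * rec + beta * sparse)

# Synthetic stand-in data: 8 patients, 13 tabular features (assumed sizes).
rng = np.random.default_rng(0)
n_patients, n_features, n_hidden = 8, 13, 20
x = rng.random((n_patients, n_features))
y = rng.integers(0, 2, size=n_patients).astype(float)

# Untrained single-layer encoder and decoder, for shape illustration only.
W_enc = rng.normal(scale=0.1, size=(n_features, n_hidden))
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_features))
hidden = sigmoid(x @ W_enc)       # latent representation
x_rec = sigmoid(hidden @ W_dec)   # reconstruction of the input

# Assumed augmentation: concatenate raw and latent features, then add a
# channel axis so a 1D CNN could treat each row as a length-33 sequence.
augmented = np.concatenate([x, hidden], axis=1)   # shape (8, 33)
sequence = augmented[:, :, np.newaxis]            # shape (8, 33, 1)

# Placeholder classifier: one linear layer standing in for the CNN.
w_head = rng.normal(scale=0.1, size=augmented.shape[1])
p = sigmoid(augmented @ w_head)

loss = multitask_loss(x, x_rec, hidden, y, p)
```

In a real training loop, `loss` would be minimized jointly over the encoder, decoder, and classifier parameters, so that the latent features are shaped by both the reconstruction and the classification objectives.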
The proposed framework was validated using ten-fold cross-validation on a benchmark heart disease prediction dataset. It achieved a classification accuracy of 92%, exceeding both traditional machine learning methods and the standalone neural network models reported in previous studies. Statistical analysis showed that this improvement was statistically significant. Additionally, the explainability analysis identified clinically relevant risk factors driving the model's predictions, which enhances transparency and clinical confidence.
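The evaluation protocol above can be illustrated with a short standard-library Python sketch. It makes two assumptions that the abstract does not confirm: that the ten folds are stratified by class, and that significance is assessed with a paired t-test on per-fold accuracies. The helper names `stratified_kfold` and `paired_t_statistic` are hypothetical.

```python
import math
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's samples across the folds in round-robin order.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired per-fold scores (n - 1 degrees of freedom).

    Assumes the paired differences are not all identical; otherwise the
    sample variance is zero and the statistic is undefined.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)
```

Each model under comparison would be trained and scored on the same ten splits, and the two resulting lists of fold accuracies passed to `paired_t_statistic`. The statistic is then compared against a t distribution with nine degrees of freedom.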
The proposed method provides a scalable, interpretable, and clinically relevant approach to the early detection of cardiovascular disease (CVD) and to related clinical decision support.