Hybrid Neuro-Explainable Ensemble Framework for Early Detection of Parkinson’s Disease Using Speech-Based Acoustic Features

Bavisha Pankaj; Dr. D. Deva Hema; Pranjal Upadhyay; Skanthah Lakshmi Senthilkumar

doi:10.51244/IJRSI.2026.1304000085

Published: May 2, 2026 • DOI: 10.51244/IJRSI.2026.1304000085

Abstract

Early-onset Parkinson’s Disease (PD) is a major clinical concern because neurological degeneration is continuous and prodromal symptoms are subtle. The degeneration of dopaminergic neurons in patients with PD directly affects both motor and vocal activity, and acoustic degradation of the voice is more pronounced than the impairment of other motor activities. Vocal parameters such as jitter, shimmer, and harmonics-to-noise ratio have shown high potential for early PD detection. However, individual supervised learning models such as Logistic Regression and Gradient Boosting do not, on their own, fully capture the non-linear variability in pathological speech. To address this, a novel framework, the Hybrid Neuro-Explainable Ensemble Framework (HNEF), is proposed. The framework integrates two supervised learning models, Regularized Logistic Regression and Gradient Boosting, through weighted soft voting, which allows it to capture both linear and non-linear decision boundaries. A novel hybrid oversampling technique that combines K-Means-based synthetic minority oversampling with density-sensitive oversampling is incorporated to address the class imbalance common in PD datasets. Feature relevance is determined by a sequential two-stage pipeline consisting of Recursive Feature Elimination and Mutual Information scoring, which preserves the most diagnostically relevant vocal features. The prediction pipeline incorporates interpretability through SHAP-based global attribution and an attention mechanism, supporting accountability in telemedicine and AI-assisted clinical settings.
Experimental results on a standard PD speech dataset show that HNEF achieves a classification accuracy of 97.8%, an F1-score of 97.1%, and an AUC-ROC of 0.99, outperforming all individual baseline models, including SVM, Random Forest, XGBoost, and deep neural networks. Ten-fold cross-validation confirms the robustness of these findings, with an accuracy of 97.5% ± 0.5%. Future work will explore extending HNEF to continuous speech monitoring for tracking disease progression, integrating it with multimodal biomarkers such as gait and EEG, and validating it prospectively on demographically diverse patients with a view to regulatory approval.
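
The weighted soft voting step named in the abstract can be illustrated with a minimal sketch. It assumes each base model outputs class probabilities in the order [P(healthy), P(PD)]. The probability values, the weights, and the function names weighted_soft_vote and predict_label are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def weighted_soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Return the weighted average of per-model class probabilities.

    probas[m][c] is model m's predicted probability for class c.
    """
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    # Normalise by the weight sum so the result is again a distribution.
    return [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]


def predict_label(avg_proba: Sequence[float]) -> int:
    """Pick the class index with the highest combined probability."""
    return max(range(len(avg_proba)), key=lambda c: avg_proba[c])


# Hypothetical outputs for one voice recording: [P(healthy), P(PD)].
lr_proba = [0.60, 0.40]  # assumed output of the logistic regression model
gb_proba = [0.20, 0.80]  # assumed output of the gradient boosting model
weights = [0.4, 0.6]     # illustrative weights, not the paper's values

combined = weighted_soft_vote([lr_proba, gb_proba], weights)
label = predict_label(combined)
```

In this example the two models disagree on the most likely class, and the more heavily weighted model decides the outcome. In practice the weights would typically be tuned on validation data.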
