Certified Adversarial Robustness in Deep Learning Via Differential Privacy and Ensemble Training

by Charles Roland Haruna, Edmund Ofei Ayeh, Kwame Opuni-Boachie Obour Agyekum, Maame Gyamfua Asante-Mensah, Obed Tettey Nartey, Pius Kwao Gadosey

Published: May 22, 2026 • DOI: 10.51244/IJRSI.2026.1305000026

Abstract

Deep learning models remain susceptible to adversarial attacks, which poses serious risks in safety-critical applications such as autonomous driving and medical diagnosis. This study introduces the Certified Robustness Differential Privacy (CRDP) framework, which integrates differential privacy (DP) with ensemble adversarial training to improve robustness while preserving accuracy. CRDP combines DP noise mechanisms (Laplace and Gaussian) with dynamic adversarial mixing and balances the robustness-accuracy trade-off through principled noise calibration. In experiments on CIFAR-10 and MNIST, the ensemble model achieves 99.12% accuracy under adversarial attack at ε = 0.5, surpassing single-model baselines by 1.84 percentage points. CRDP further attains a certified accuracy of 80% using Laplace noise (ε = 0.5), outperforming Gaussian noise under an equivalent privacy budget. Adversarial training based on Projected Gradient Descent (PGD) additionally improves resilience against iterative attacks. These findings indicate that Laplace noise strengthens certified security guarantees while maintaining competitive model performance. This work pairs theoretical privacy guarantees with empirical validation and offers practical strategies for deploying robust deep learning models in adversarial environments.
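
As an illustration of the noise calibration the abstract refers to, the following minimal sketch shows the standard textbook calibrations for the Laplace and Gaussian mechanisms at a privacy budget of ε = 0.5. It is not the CRDP implementation: the function names, the unit sensitivity, and the value δ = 1e-5 are assumptions made for this example only, and the Gaussian formula used here is the classical bound that holds for ε < 1.

```python
import math
import numpy as np


def laplace_scale(sensitivity_l1, epsilon):
    """Scale b of the Laplace mechanism: b = Delta_1 / epsilon (pure epsilon-DP)."""
    return sensitivity_l1 / epsilon


def gaussian_sigma(sensitivity_l2, epsilon, delta):
    """Classical Gaussian mechanism calibration, valid for 0 < epsilon < 1:
    sigma = sqrt(2 * ln(1.25 / delta)) * Delta_2 / epsilon, giving (epsilon, delta)-DP."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity_l2 / epsilon


def add_dp_noise(x, epsilon, mechanism="laplace", sensitivity=1.0,
                 delta=1e-5, rng=None):
    """Return x plus i.i.d. noise drawn from the chosen mechanism.

    Hypothetical helper: sensitivity=1.0 and delta=1e-5 are illustrative
    defaults, not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    if mechanism == "laplace":
        scale = laplace_scale(sensitivity, epsilon)
        return x + rng.laplace(loc=0.0, scale=scale, size=x.shape)
    if mechanism == "gaussian":
        sigma = gaussian_sigma(sensitivity, epsilon, delta)
        return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)
    raise ValueError(f"unknown mechanism: {mechanism!r}")


if __name__ == "__main__":
    eps = 0.5
    print("Laplace scale b:", laplace_scale(1.0, eps))
    print("Gaussian sigma: ", gaussian_sigma(1.0, eps, 1e-5))
    batch = np.zeros((2, 3))  # stand-in for an input or hidden-layer tensor
    print(add_dp_noise(batch, eps, mechanism="laplace",
                       rng=np.random.default_rng(0)))
```

Under these assumed settings the Gaussian mechanism needs a noticeably larger noise scale than the Laplace mechanism for the same ε. The comparison is only indicative, since the two mechanisms use different sensitivity norms (L1 and L2) and provide different guarantees (pure ε-DP and (ε, δ)-DP).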