Enhanced Multi-Task CNN For Age, Gender, Race with Mask in Facial Images

by Kimenyi Butera John Bosco, Yonggang Chi

Published: April 15, 2026 • DOI: 10.51244/IJRSI.2026.1303000208

Abstract

Facial attribute analysis is a critical technology for security, human-computer interaction, and public health. However, conventional models that estimate age, gender, and race independently are computationally inefficient and struggle with real-world challenges, particularly facial occlusions such as face masks. This paper proposes an enhanced multi-task convolutional neural network (CNN) that addresses these limitations by simultaneously predicting age, gender, race, and mask presence from a single input image. The architecture employs a shared ResNet-50 backbone for feature extraction, augmented with a dedicated attention mechanism that improves robustness to occlusion by focusing on the most relevant facial regions. Task-specific heads with dropout and batch normalisation are used to improve generalisation. The model is evaluated with regression and classification metrics. Results show that the multi-task framework significantly outperforms traditional single-task models, achieving a mask detection accuracy above 95%, a gender classification accuracy exceeding 91%, a race classification accuracy of over 86%, and an age estimation mean absolute error (MAE) below 6 years. These results indicate that integrating multi-task learning with an occlusion-aware attention mechanism yields a more efficient, accurate, and robust system for facial analysis. The proposed model shows strong potential for deployment in real-world applications where reliability in the presence of occlusions is essential.