Predictive Maintenance in Semiconductor Manufacturing Using Machine Learning on Imbalanced Dataset

by Aziz Ahmad, Spogmay Yousafzai, Syed Amir Ali Shah

Published: October 28, 2025 • DOI: 10.51244/IJRSI.2025.1210000018

Abstract

Semiconductor manufacturing produces complex, high-dimensional datasets that consist mostly of normal operational records, with product failures occurring in only a small portion. Several studies apply machine learning algorithms to predictive maintenance, but very few address the class imbalance of the SECOM dataset, which contains up to 93% successful outcomes. This paper describes the existing research gap regarding the imbalanced SECOM data and presents an integrated approach that combines innovative feature reduction and oversampling algorithms with model optimization methods. In our experiments on the SECOM semiconductor manufacturing process dataset, the initial 591 features were reduced to 63 and processed with PCA; on this representation the Support Vector Classifier (SVC) produced the most accurate results, at 98.6%, while maintaining robust calibration. The visualizations include a correlation heatmap of related features and pie charts of the class distribution before and after the data balancing techniques are applied. We discuss the implications for predictive maintenance in semiconductor fabs and give recommendations for future work.
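
To illustrate the class-balancing step described above, the following is a minimal sketch of random oversampling in plain Python. It is not the authors' method: the abstract does not name the oversampling algorithm used, so random duplication of minority-class rows is a stand-in. The function name `random_oversample`, the synthetic labels (930 passes and 70 failures, mirroring the roughly 93% pass rate), and the single-feature rows are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Append random duplicates until the class reaches the target size
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Synthetic stand-in data (NOT the SECOM dataset):
# label 0 = pass, label 1 = fail, about 93% passes.
labels = [0] * 930 + [1] * 70
rows = [[float(i)] for i in range(len(labels))]

balanced_rows, balanced_labels = random_oversample(rows, labels)

before = Counter(labels)
after = Counter(balanced_labels)
print("before:", dict(before))
print("after: ", dict(after))
```

In practice, oversampling of this kind should be applied only to the training split, after the train/test partition, so that duplicated failure records do not leak into the evaluation data.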