An Ensemble Learning Approach for Cancer Detection

by Prof. (Dr.) Satya Singh, Ratnesh Kumar Sharma

Published: June 11, 2026 • DOI: 10.51244/IJRSI.2026.1305000234

Abstract

Cancer classification from high-dimensional microarray datasets has become an important research area in healthcare analytics, bioinformatics, and intelligent clinical decision support systems. Conventional machine learning approaches applied to such data frequently suffer from feature redundancy, noisy attributes, overfitting, computational complexity, and reduced predictive stability. This paper presents a hybrid ensemble machine learning framework for cancer classification on binary and multiclass microarray datasets. The framework combines feature selection and dimensionality reduction techniques (Recursive Feature Elimination, Minimum Redundancy Maximum Relevance, Boruta, Correlation-based Feature Selection, and Principal Component Analysis) with metaheuristic optimization algorithms (Ant Colony Optimization, Particle Swarm Optimization, Improved Grey Wolf Optimization, Ant Lion Optimization, and Salp Swarm Optimization) to identify the most informative gene expression features and reduce dimensionality. Multiple classifiers, including Support Vector Machine, Random Forest, AdaBoost, XGBoost, Extreme Learning Machine, and ensemble voting, are then applied to improve predictive reliability, robustness, and generalization. Experiments on lung cancer, colon cancer, prostate cancer, leukemia, breast cancer, ALL-AML, lymphoma, and SRBCT microarray datasets demonstrated significant improvements in classification accuracy, sensitivity (recall), specificity, precision, Matthews Correlation Coefficient, and F1 score compared with conventional machine learning classifiers.
The proposed hybrid ensemble framework effectively minimizes misclassification, enhances feature optimization, improves classification stability, and provides a reliable computational approach for intelligent cancer diagnosis, healthcare analytics, and precision clinical decision support systems [1], [2].
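To make the shape of such a pipeline concrete, the following is a minimal sketch and not the authors' implementation. It assumes scikit-learn, uses a synthetic dataset as a stand-in for a binary microarray dataset, and covers only one selector (Recursive Feature Elimination with a linear SVM) and three of the named classifiers (SVM, Random Forest, AdaBoost) combined by hard voting. The metaheuristic optimizers, XGBoost, and Extreme Learning Machine are omitted. All function names, sample counts, and parameter values below are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.feature_selection import RFE
from sklearn.metrics import confusion_matrix, f1_score, matthews_corrcoef
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_microarray(n_samples=120, n_genes=300, seed=0):
    """Hypothetical stand-in for microarray data: few samples, many features."""
    return make_classification(
        n_samples=n_samples, n_features=n_genes, n_informative=15,
        n_redundant=30, n_classes=2, random_state=seed)


def build_pipeline(n_selected=20, seed=0):
    """Scaling, then RFE gene selection, then a hard-voting ensemble."""
    # A linear SVM exposes coef_, which RFE uses to rank and drop features.
    selector = RFE(SVC(kernel="linear"),
                   n_features_to_select=n_selected, step=0.1)
    ensemble = VotingClassifier(
        estimators=[
            ("svm", SVC(kernel="rbf")),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
            ("ada", AdaBoostClassifier(n_estimators=50, random_state=seed)),
        ],
        voting="hard")  # majority vote over predicted class labels
    return Pipeline([("scale", StandardScaler()),
                     ("select", selector),
                     ("vote", ensemble)])


def binary_metrics(y_true, y_pred):
    """Metrics named in the abstract, computed for a binary problem."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # same as recall
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "f1": f1_score(y_true, y_pred, zero_division=0),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }


def evaluate_sketch(seed=0):
    """Fit on a stratified training split and score on the held-out split."""
    X, y = make_synthetic_microarray(seed=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    model = build_pipeline(seed=seed).fit(X_tr, y_tr)
    return binary_metrics(y_te, model.predict(X_te)), model


if __name__ == "__main__":
    metrics, _ = evaluate_sketch()
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Placing the selector inside the pipeline means feature ranking is fitted on the training split only, which avoids leaking held-out samples into gene selection. This matters for microarray data, where features far outnumber samples.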