Web Based Application for Early Detection of Thyroid Disorders in Nigeria

by Emmanuel O. Ayodele, Iyinoluwa T. Idowu, Peter A. Idowu, Peter S. Idoko

Published: April 11, 2026 • DOI: 10.51584/IJRIAS.2026.11030067

Abstract

Thyroid gland disorders represent a significant public health challenge globally, with a particularly pronounced burden in low- and middle-income countries such as Nigeria. This paper focuses on selecting the best features for the early detection of thyroid disorders in Nigeria using a machine learning (ML) approach. Feature selection is crucial to designing a good model and obtaining the best model performance: removing redundant and irrelevant features from the original dataset lets the model train faster, makes the data easier to interpret, and reduces the risk of overfitting. The study develops a robust ML-based feature selection procedure for predicting thyroid gland disorders early, leveraging clinical data (TSH, T3, T4, autoantibodies), ultrasound findings, demographic variables (age, sex, BMI), and environmental factors (iodine status, goitrogen exposure). It employs a dual-pronged approach to feature selection, combining filter-based methods with Random Forest techniques to ensure comprehensive identification of the most predictive variables. The results showed that Random Forest and Gradient Boosting performed best among the models compared, with Random Forest slightly outperforming Gradient Boosting. Using all features, Random Forest achieved accuracy = 0.9978, precision = 0.9986, recall = 0.9971, F1-score = 0.9978, and ROC-AUC = 0.9999, indicating near-perfect discrimination. Gradient Boosting followed closely with similar metrics (accuracy = 0.9971, ROC-AUC = 0.9999). In conclusion, the comparative analysis confirms that Random Forest and Gradient Boosting offer the most reliable and accurate predictions, benefiting from their ensemble architecture and ability to model complex interactions.
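
The dual-pronged selection and the model comparison described in the abstract could be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' pipeline: the data are synthetic, the column names (including `ultrasound_score`) are hypothetical stand-ins for the variable groups listed above, mutual information is assumed as the filter criterion, and taking the union of the two rankings is an assumed way of combining them.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Hypothetical column names; the real dataset schema is not given in the abstract.
FEATURE_NAMES = ["TSH", "T3", "T4", "autoantibodies", "ultrasound_score",
                 "age", "sex", "BMI", "iodine_status", "goitrogen_exposure"]


def filter_top_k(X, y, k):
    """Filter-based prong: keep the k features with the highest mutual information."""
    score = partial(mutual_info_classif, random_state=0)
    selector = SelectKBest(score_func=score, k=k).fit(X, y)
    return {int(i) for i in np.flatnonzero(selector.get_support())}


def forest_top_k(X, y, k):
    """Random Forest prong: keep the k features with the highest impurity importance."""
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]
    return {int(i) for i in order[:k]}


def select_features(X, y, k=5):
    """Combine both prongs by union (an assumption; intersection is also plausible)."""
    return sorted(filter_top_k(X, y, k) | forest_top_k(X, y, k))


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit a classifier and report the five metrics quoted in the abstract."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }


def run_demo():
    # Synthetic binary-classification data standing in for the clinical records.
    X, y = make_classification(n_samples=500, n_features=len(FEATURE_NAMES),
                               n_informative=4, n_redundant=2, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)

    # Select features on the training split only, to avoid leaking test data.
    selected = select_features(X_train, y_train, k=5)

    models = {
        "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    }
    results = {
        name: evaluate(model, X_train[:, selected], X_test[:, selected],
                       y_train, y_test)
        for name, model in models.items()
    }
    return selected, results


if __name__ == "__main__":
    selected, results = run_demo()
    print("Selected:", [FEATURE_NAMES[i] for i in selected])
    for name, metrics in results.items():
        print(name, {m: round(v, 4) for m, v in metrics.items()})
```

Selecting features on the training split only keeps the held-out metrics honest. Scores from this synthetic run say nothing about the figures reported in the abstract.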