Embedding Learning for Unsupervised Breast Cancer Images Clustering
by Adolphe Andriamanga Ratiarison, Andriamasinoro Rahajaniaina
Published: May 8, 2026 • DOI: 10.51584/IJRIAS.2026.110400082
Abstract
Early detection of breast cancer significantly reduces the number of deaths caused by the disease. In Africa, the number of new cases and deaths is constantly increasing, and for Madagascar very little information is available on the number of people affected. Advances in the application of artificial intelligence to medicine are improving detection techniques. Unfortunately, most of these techniques are cumbersome, complex, and very expensive. In this work, we propose a lightweight, hybrid approach to clustering breast cancer images that combines deep learning, ArcFace, and unsupervised clustering. The architecture relies on the MobileNetV3Small convolutional network as a feature extractor. A projection head is added at the output of the backbone to transform the feature maps into a compact embedding vector. The goal is to project the data into a low-dimensional (64-dimensional) latent space in which the discriminating properties between classes are strengthened. ArcFace improves intra-class compactness and inter-class separability, which enhances the quality of the learned representations. Training proceeds in two phases: first, only the projection layers and the ArcFace layer are trained, with the backbone kept frozen to stabilize learning; then, partial fine-tuning is performed by unfreezing the final layers of the convolutional neural network. Principal Component Analysis (PCA) is used to structure the embeddings in a lower-dimensional space while preserving most of the discriminating information. A comparative study was conducted to evaluate the clustering capabilities of K-Means and HDBSCAN. K-Means gives the best results on all metrics used. Despite its low computational cost (3.6 GFLOPs), our model achieves performance comparable to other state-of-the-art approaches.
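To illustrate how an ArcFace layer can strengthen class separation, the following minimal sketch computes additive angular margin logits in plain Python. It is not the authors' implementation: the function names, the scale of 30, the margin of 0.5 radians, and the 3-dimensional toy vectors (used in place of the 64-dimensional embeddings) are all assumptions made for illustration, and the handling of angles that exceed pi after the margin is added is omitted for brevity.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Return scaled cosine logits with an angular margin on the target class.

    scale and margin are hypothetical values, not taken from the paper.
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        w = l2_normalize(w)
        # Cosine of the angle between the embedding and the class center.
        cos_theta = sum(a * b for a, b in zip(e, w))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if j == target:
            # Penalize the true class by widening its angle by the margin.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


def softmax_cross_entropy(logits, target):
    """Numerically stable cross-entropy for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


# Toy example: two classes, 3-dimensional vectors for readability.
embedding = [0.6, 0.8, 0.0]
class_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

plain = arcface_logits(embedding, class_weights, target=1, margin=0.0)
with_margin = arcface_logits(embedding, class_weights, target=1, margin=0.5)

loss_plain = softmax_cross_entropy(plain, target=1)
loss_margin = softmax_cross_entropy(with_margin, target=1)
```

In this sketch the margin lowers only the logit of the true class, so the loss stays high until the embedding moves closer in angle to its class center. This is the mechanism by which an angular margin encourages intra-class compactness and inter-class separability.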