Comparative Analysis of Clustering Algorithms on Synthetic Circular Patterns Data

Authors

Hardev Ranglani, EXL Service Inc, USA

Abstract

Clustering algorithms play a pivotal role in discovering hidden patterns in unlabeled data, but their performance varies significantly across datasets with complex geometries. This paper explores the performance of various clustering techniques in identifying distinct circular clusters within the Synthetic Circle Data Set, a benchmark dataset designed to test algorithms on non-linear structures. We evaluate popular clustering methods, including k-means, DBSCAN, Gaussian Mixture Models, hierarchical clustering, and emerging techniques like Self Organizing Maps, Mean Shift Clustering and Spectral Clustering. Using metrics such as Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), and Silhouette Score, along with detailed visualizations, we systematically compare the algorithms’ ability to recover the true circle-based clusters without prior labels. Our findings highlight the strengths and limitations of each method, revealing that density- and graph-based algorithms consistently outperform traditional techniques like k-means in handling circular patterns.
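
To make the comparison concrete, the following is a minimal, standard-library-only sketch of the kind of experiment the abstract describes. It is not the paper's code or data. It assumes two noisy concentric rings as a stand-in for the Synthetic Circle Data Set, and it replaces DBSCAN with a simplified density-style method: connected components of the eps-neighborhood graph, which corresponds to DBSCAN with a minimum neighborhood size of one. The helper names (`make_rings`, `kmeans`, `eps_components`, `adjusted_rand_index`) and the parameter values (`k=2`, `eps=0.5`, the radii, and the noise level) are illustrative assumptions.

```python
import math
import random
from collections import Counter


def make_rings(n_per_ring=100, radii=(1.0, 3.0), noise=0.05, seed=0):
    """Hypothetical stand-in data: noisy concentric rings with known labels."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, radius in enumerate(radii):
        for i in range(n_per_ring):
            theta = 2 * math.pi * i / n_per_ring
            r = radius + rng.gauss(0, noise)  # small radial jitter
            points.append((r * math.cos(theta), r * math.sin(theta)))
            labels.append(label)
    return points, labels


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return assign


def eps_components(points, eps):
    """Simplified density-style clustering (an assumption, not full DBSCAN):
    points within eps of each other are linked, and clusters are the
    connected components of that graph."""
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j, q in enumerate(points):
                if labels[j] == -1 and math.dist(points[i], q) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def adjusted_rand_index(a, b):
    """ARI computed from pair counts of the contingency table."""
    pairs = lambda m: m * (m - 1) / 2
    sum_ij = sum(pairs(c) for c in Counter(zip(a, b)).values())
    sum_a = sum(pairs(c) for c in Counter(a).values())
    sum_b = sum(pairs(c) for c in Counter(b).values())
    expected = sum_a * sum_b / pairs(len(a))
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster each
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


points, truth = make_rings()
ari_kmeans = adjusted_rand_index(truth, kmeans(points, k=2))
ari_density = adjusted_rand_index(truth, eps_components(points, eps=0.5))
print(f"k-means ARI:           {ari_kmeans:.3f}")
print(f"eps-neighborhood ARI:  {ari_density:.3f}")
```

On this toy data, k-means tends to cut both rings with a straight boundary, so its ARI should sit near zero, while the neighborhood-graph method follows each ring and should recover both clusters. The choice of `eps` matters: it must exceed the spacing between neighbors on a ring but stay below the gap between rings, which is an easy condition here and a harder one on noisier data.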


Keywords

Clustering, K-Means algorithm, Non-linear patterns, Density-Based Clustering, Hierarchical Clustering, Gaussian Mixture Models, Adjusted Rand Index, Spectral Clustering.