Selecting Optimal k in the k-means Clustering Algorithm
Subject Areas : Journal of Computer & Robotics
Mojtaba Jahanian 1, Abbas Karimi 2, Faraneh Zarafshan 3
1 - Department of Computer Engineering, Faculty of Engineering, Arak Branch, Islamic Azad University, IRAN, Akarimi@iau-arak.ac.ir
2 - Department of Computer Engineering, Faculty of Engineering, Arak Branch, Islamic Azad University, IRAN, Akarimi@iau-arak.ac.ir
3 - Department of Computer Engineering, Faculty of Engineering, Arak Branch, Islamic Azad University, IRAN, Faraneh@iau-arak.ac.ir
Keywords: Clustering, k-means, clustering algorithms, the optimal number of clusters
Abstract :
Clustering is one of the essential techniques in machine learning, and it is unsupervised: the data are not labeled. The most fundamental challenge in clustering algorithms is choosing the correct number of clusters before the algorithm starts. The performance of a clustering algorithm depends on selecting both an appropriate number of clusters and the right cluster centers, so the quality and the number of clusters are essential to analyzing the algorithm. This article distinguishes itself from other work by carefully analyzing and comparing existing algorithms, with the aim of a clear and accurate understanding of all their aspects. Building on a comparison of other methods, we present an intelligent method for selecting the optimal number of clusters using three criteria: the minimum internal distance between the points of a cluster, the maximum external distance between clusters, and the location of a cluster. Based on the results obtained in the research, the method chooses the clusters with the lowest error and the lowest internal variance.
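To make the first two criteria concrete, the following is a minimal Python sketch of choosing k by trading off internal distance against external distance. It is not the method proposed in the article, whose details the abstract does not give. The score used here (mean point-to-center distance divided by the smallest center-to-center distance), the deterministic farthest-point seeding, and all function names are illustrative assumptions. The third criterion, cluster location, is not modeled.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the first
    point, then repeatedly add the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, max_iters=100):
    """Plain Lloyd iterations; an empty cluster keeps its previous center."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iters):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def compactness_separation_ratio(centers, clusters):
    """Illustrative score: mean distance of points to their own center
    (internal) divided by the smallest distance between centers (external).
    Lower is better. Assumes at least two distinct centers."""
    n = sum(len(cl) for cl in clusters)
    intra = sum(math.dist(p, c) for c, cl in zip(centers, clusters) for p in cl) / n
    inter = min(math.dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return intra / inter


def choose_k(points, k_min=2, k_max=5):
    """Return the k in [k_min, k_max] with the lowest ratio, plus all scores."""
    scores = {
        k: compactness_separation_ratio(*kmeans(points, k))
        for k in range(k_min, k_max + 1)
    }
    return min(scores, key=scores.get), scores


# Toy data: three well-separated 2x2 groups of points.
points = [(x + dx, y + dy) for x, y in [(0, 0), (10, 0), (0, 10)]
          for dx in (0, 1) for dy in (0, 1)]
best_k, scores = choose_k(points)
print(best_k)  # 3 for this toy data
```

On the toy data the ratio is smallest at k = 3: fewer clusters inflate the internal distance, while more clusters place two centers inside one group and shrink the external distance. On real data a ratio like this is only one of several possible validity indices, and the seeding strategy can change the outcome.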