Authors
Tapas Kanungo, David M Mount, Nathan S Netanyahu, Christine D Piatko, Ruth Silverman, Angela Y Wu
Publication date
2002/6/5
Book
Proceedings of the eighteenth annual symposium on Computational geometry
Pages
10-18
Description
In k-means clustering we are given a set of n data points in d-dimensional space ℜ^d and an integer k, and the problem is to determine a set of k points in ℜ^d, called centers, to minimize the mean squared distance from each data point to its nearest center. No exact polynomial-time algorithms are known for this problem. Although asymptotically efficient approximation algorithms exist, these algorithms are not practical due to the extremely high constant factors involved. There are many heuristics that are used in practice, but we know of no bounds on their performance. We consider the question of whether there exists a simple and practical approximation algorithm for k-means clustering. We present a local improvement heuristic based on swapping centers in and out. We prove that this yields a (9+ε)-approximation algorithm. We show that the approximation factor is almost tight, by giving an example for which the …
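The swap idea in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes candidate centers are drawn from the data points themselves, tries one swap at a time, and accepts a swap only when it lowers the mean squared distance by a small relative margin. The names `mean_sq_dist`, `single_swap_search`, and `min_gain` are hypothetical, chosen for illustration only.

```python
import random


def mean_sq_dist(points, centers):
    """Mean squared distance from each point to its nearest center."""
    total = 0.0
    for pt in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(pt, ctr)) for ctr in centers
        )
    return total / len(points)


def single_swap_search(points, k, min_gain=1e-9, seed=0):
    """Local search: swap one center out and one candidate in while it helps.

    Assumption: candidates are the input points (tuples of numbers).
    `min_gain` is a hypothetical relative-improvement threshold.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    best = mean_sq_dist(points, centers)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in points:
                if cand in centers:
                    continue
                # Replace center i with the candidate and re-evaluate.
                trial = centers[:i] + [cand] + centers[i + 1:]
                trial_cost = mean_sq_dist(points, trial)
                if trial_cost < best * (1 - min_gain):
                    centers, best = trial, trial_cost
                    improved = True
    return centers, best


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(single_swap_search(data, k=2))
```

Each pass costs on the order of k · n cost evaluations, so this sketch is only suitable for small inputs; it makes no claim about matching the approximation factor stated in the description.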
Total citations
[Bar chart: citations per year, 2003 to 2024]
Scholar articles
T Kanungo, DM Mount, NS Netanyahu, CD Piatko… - Proceedings of the eighteenth annual symposium on …, 2002