Authors
Flavio Chierichetti, Nilesh Dalvi, Ravi Kumar
Publication date
2014/8/24
Conference
Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
Pages
641-650
Publisher
ACM
Description
Correlation clustering is a basic primitive in the data miner's toolkit, with applications ranging from entity matching to social network analysis. Given a graph with signed edges, the goal in correlation clustering is to partition the nodes into clusters so as to minimize the number of disagreements. In this paper we obtain a new algorithm for correlation clustering. Our algorithm is easily implementable in computational models such as MapReduce and streaming, and runs in a small number of rounds. In addition, we show that our algorithm obtains an almost 3-approximation to the optimal correlation clustering. Experiments on huge graphs demonstrate the scalability of our algorithm and its applicability to data mining problems.
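To make the objective in the description concrete, here is a minimal sketch in Python. It assumes a complete signed graph in which every pair of nodes not listed as a positive edge is treated as negative; a disagreement is then a positive edge between two clusters or a negative pair inside one cluster. The function names `disagreements` and `pivot_clustering` are hypothetical, chosen for illustration. The clustering routine is the classical sequential random-pivot heuristic, swapped in here as a simple stand-in; it is not the MapReduce or streaming algorithm of the paper.

```python
import random
from itertools import combinations


def disagreements(nodes, positive_edges, clusters):
    """Count disagreements of a clustering on a complete signed graph.

    Assumption: every pair of nodes not in positive_edges is a negative edge.
    """
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    positive = {frozenset(edge) for edge in positive_edges}
    cost = 0
    for u, v in combinations(nodes, 2):
        same_cluster = label[u] == label[v]
        is_positive = frozenset((u, v)) in positive
        # Disagreement: a positive edge that is cut, or a negative pair kept together.
        if same_cluster != is_positive:
            cost += 1
    return cost


def pivot_clustering(nodes, positive_edges, seed=0):
    """Sequential random-pivot heuristic (illustrative, not the paper's algorithm)."""
    rng = random.Random(seed)
    neighbors = {v: set() for v in nodes}
    for u, v in positive_edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    order = list(nodes)
    rng.shuffle(order)  # visit nodes in a random order

    unclustered = set(nodes)
    clusters = []
    for pivot in order:
        if pivot not in unclustered:
            continue
        # The pivot takes all of its still-unclustered positive neighbors.
        cluster = {pivot} | (neighbors[pivot] & unclustered)
        clusters.append(cluster)
        unclustered -= cluster
    return clusters


if __name__ == "__main__":
    nodes = [1, 2, 3, 4, 5]
    positive_edges = [(1, 2), (2, 3), (1, 3), (4, 5)]
    clusters = pivot_clustering(nodes, positive_edges)
    print(clusters, disagreements(nodes, positive_edges, clusters))
```

On this toy input the positive edges form two disjoint cliques, so any pivot order recovers them with zero disagreements. On less clean inputs the result depends on the random order, which is why the seed is exposed as a parameter.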
Total citations
Citations-per-year chart, 2014-2024
Scholar articles
F Chierichetti, N Dalvi, R Kumar - Proceedings of the 20th ACM SIGKDD international …, 2014