Authors
Arindam Banerjee, Inderjit Dhillon, Joydeep Ghosh, Srujana Merugu, Dharmendra S Modha
Publication date
2004/8/22
Book
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Pages
509-514
Description
Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an information-theoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation based constraints can be considered based on the statistics that need to be preserved. Analysis of the co-clustering problem leads to the minimum Bregman information principle, which generalizes the maximum entropy principle, and yields an elegant meta algorithm that is guaranteed to achieve local optimality. Our methodology yields new algorithms and also encompasses several previously …
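To make the term "Bregman divergence" in the description concrete, here is a minimal Python sketch of the standard definition, d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>, together with two well-known instances (squared Euclidean distance and I-divergence). It is not the paper's co-clustering algorithm. All function names (`bregman_divergence`, `squared_euclidean`, `i_divergence`, `mean_divergence_to`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """Generic Bregman divergence: phi(x) - phi(y) - <x - y, grad_phi(y)>."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(phi(x) - phi(y) - np.dot(x - y, grad_phi(y)))

def squared_euclidean(x, y):
    # phi(v) = ||v||^2 gives d_phi(x, y) = ||x - y||^2.
    return bregman_divergence(lambda v: np.dot(v, v), lambda v: 2.0 * v, x, y)

def i_divergence(x, y):
    # phi(v) = sum v log v (entries must be positive) gives
    # sum x log(x / y) - sum x + sum y, which equals the KL divergence
    # when x and y are probability vectors.
    return bregman_divergence(
        lambda v: np.sum(v * np.log(v)),
        lambda v: np.log(v) + 1.0,
        x, y,
    )

def mean_divergence_to(points, representative, divergence):
    """Average divergence from each row of `points` to one representative."""
    return float(np.mean([divergence(p, representative) for p in points]))
```

For any Bregman divergence, the arithmetic mean of a set of points minimizes the average divergence from the points to a single representative. Calling `mean_divergence_to(points, points.mean(axis=0), i_divergence)` should therefore never exceed the value obtained with any other representative.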
Total citations
[Citations-per-year chart, 2004 to 2024; the individual yearly counts are not reliably recoverable from the extracted text]