A generalized maximum entropy approach to Bregman co-clustering and matrix approximation

Authors

Arindam Banerjee, Inderjit Dhillon, Joydeep Ghosh, Srujana Merugu, Dharmendra S Modha

Publication date

2004/8/22

Book

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining

Pages

509-514

Description

Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an information-theoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation based constraints can be considered based on the statistics that need to be preserved. Analysis of the co-clustering problem leads to the minimum Bregman information principle, which generalizes the maximum entropy principle, and yields an elegant meta algorithm that is guaranteed to achieve local optimality. Our methodology yields new algorithms and also encompasses several previously …
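For readers unfamiliar with the term, a Bregman divergence generated by a strictly convex, differentiable function phi is d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>. The sketch below is only an illustration of that standard definition, not code from the paper; the function names are hypothetical. It shows two well-known special cases: squared Euclidean distance and the generalized I-divergence.

```python
import math


def bregman_divergence(phi, grad_phi, x, y):
    """Return d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>.

    x and y are equal-length lists of floats; phi maps a list to a float
    and grad_phi maps a list to its gradient (a list). Names are illustrative.
    """
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, grad_phi(y)))
    return phi(x) - phi(y) - inner


# phi(z) = ||z||^2 generates the squared Euclidean distance.
def sq_norm(z):
    return sum(zi * zi for zi in z)


def sq_norm_grad(z):
    return [2.0 * zi for zi in z]


# phi(z) = sum(z log z - z) generates the generalized I-divergence
# (requires strictly positive entries).
def neg_entropy(z):
    return sum(zi * math.log(zi) - zi for zi in z)


def neg_entropy_grad(z):
    return [math.log(zi) for zi in z]


x = [1.0, 2.0]
y = [3.0, 1.0]
d_euclid = bregman_divergence(sq_norm, sq_norm_grad, x, y)
d_idiv = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)
```

With these inputs, `d_euclid` equals the squared distance between `x` and `y`, and `d_idiv` equals the sum of x log(x/y) - x + y over the entries.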

Total citations

Cited by 582

Citations per year: 2004: 2, 2005: 7, 2006: 17, 2007: 20, 2008: 45, 2009: 25, 2010: 53, 2011: 31, 2012: 47, 2013: 36, 2014: 36, 2015: 40, 2016: 36, 2017: 37, 2018: 24, 2019: 28, 2020: 28, 2021: 20, 2022: 22, 2023: 14, 2024: 6
