Authors
Dong Xin, Hong Cheng, Xifeng Yan, Jiawei Han
Publication date
2006/8/20
Book
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Pages
444-453
Description
Observed in many applications, there is a potential need of extracting a small set of frequent patterns having not only high significance but also low redundancy. The significance is usually defined by the context of applications. Previous studies have been concentrating on how to compute top-k significant patterns or how to remove redundancy among patterns separately. There is limited work on finding those top-k patterns which demonstrate high-significance and low-redundancy simultaneously. In this paper, we study the problem of extracting redundancy-aware top-k patterns from a large collection of frequent patterns. We first examine the evaluation functions for measuring the combined significance of a pattern set and propose the MMS (Maximal Marginal Significance) as the problem formulation. The problem is known as NP-hard. We further present a greedy algorithm which approximates the optimal solution …
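The description is truncated before the greedy algorithm is spelled out, so the following is only a minimal sketch of the general idea of greedy, redundancy-aware top-k selection, not the paper's exact procedure. It assumes a marginal gain of the form "significance of a candidate minus its largest redundancy with any pattern already chosen". The names `greedy_top_k`, `significance`, and `redundancy`, the Jaccard-based redundancy measure, and the toy data are all hypothetical and chosen for illustration.

```python
from typing import Callable, Hashable, Iterable, List


def greedy_top_k(
    patterns: Iterable[Hashable],
    significance: Callable[[Hashable], float],
    redundancy: Callable[[Hashable, Hashable], float],
    k: int,
) -> List[Hashable]:
    """Greedily pick up to k patterns with high significance and low redundancy.

    Assumption (not taken from the source): the marginal gain of a candidate p
    is significance(p) minus its maximum redundancy with any selected pattern.
    """
    remaining = list(patterns)
    selected: List[Hashable] = []

    while remaining and len(selected) < k:
        def gain(p: Hashable) -> float:
            if not selected:
                return significance(p)
            return significance(p) - max(redundancy(p, q) for q in selected)

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)

    return selected


# Hypothetical toy data: pattern name -> (significance, supporting transaction ids).
PATTERNS = {
    "A": (0.90, {1, 2, 3, 4}),
    "B": (0.85, {1, 2, 3, 4}),  # same transactions as A, so fully redundant with it
    "C": (0.50, {5, 6}),
}


def sig(p: str) -> float:
    return PATTERNS[p][0]


def red(p: str, q: str) -> float:
    # Assumed redundancy: Jaccard overlap of the supporting transactions,
    # scaled by the smaller of the two significance values.
    tp, tq = PATTERNS[p][1], PATTERNS[q][1]
    jaccard = len(tp & tq) / len(tp | tq)
    return jaccard * min(sig(p), sig(q))


print(greedy_top_k(PATTERNS, sig, red, k=2))  # prints ['A', 'C']
```

In this toy run, B is individually more significant than C, but it covers exactly the same transactions as A, so its marginal gain drops to zero and C is chosen instead.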
Total citations
[Chart: citations per year, 2006-2024]
Scholar articles
D Xin, H Cheng, X Yan, J Han - Proceedings of the 12th ACM SIGKDD international …, 2006