Authors
Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, Alessandro Panconesi
Publication date
2018/4/16
Journal
ACM Transactions on Knowledge Discovery from Data (TKDD)
Volume
12
Issue
4
Pages
1-25
Publisher
ACM
Description
Counting graphlets is a well-studied problem in graph mining and social network analysis. Recently, several papers explored very simple and natural algorithms based on Monte Carlo sampling of Markov chains (MC), and reported encouraging results. We show, perhaps surprisingly, that such algorithms are outperformed by color coding (CC) [2], a sophisticated algorithmic technique that we extend to the case of graphlet sampling and for which we prove strong statistical guarantees. Our computational experiments on graphs with millions of nodes show CC to be more accurate than MC; furthermore, we formally show that the mixing time of the MC approach is too high in general, even when the input graph has high conductance. All this comes at a price, however. While MC is very efficient in terms of space, CC’s memory requirements become demanding when the size of the input graph and that of the graphlets …
Total citations
[Chart: citations per year, 2018-2024]
Scholar articles
M Bressan, F Chierichetti, R Kumar, S Leucci… - ACM Transactions on Knowledge Discovery from Data …, 2018