Authors
Amol Ghoting, Gregory Buehrer, Srinivasan Parthasarathy, Daehyun Kim, Anthony Nguyen, Yen-Kuang Chen, Pradeep Dubey
Publication date
2005/8/30
Book
Proceedings of the 31st international conference on Very large data bases
Pages
577-588
Description
In this paper, we examine the performance of frequent pattern mining algorithms on a modern processor. A detailed performance study reveals that even the best frequent pattern mining implementations, with highly efficient memory managers, still grossly under-utilize a modern processor. The primary performance bottlenecks are poor data locality and low instruction level parallelism (ILP). We propose a cache-conscious prefix tree to address this problem. The resulting tree improves spatial locality and also enhances the benefits from hardware cache line prefetching. Furthermore, the design of this data structure allows the use of a novel tiling strategy to improve temporal locality. The result is an overall speedup of up to 3.2 when compared with state-of-the-art implementations. We then show how these algorithms can be improved further by realizing a non-naive thread-based decomposition that targets simultaneously multi-threaded processors. A key aspect of this decomposition is to ensure cache re-use between threads that are co-scheduled at a fine granularity. This optimization affords an additional speedup of 50%, resulting in an overall speedup of up to 4.8. To
Total citations
2005200620072008200920102011201220132014201520162017201820192020202120222023202441292088139452411111
Scholar articles
A Ghoting, G Buehrer, S Parthasarathy, D Kim… - Proceedings of the 31st international conference on …, 2005