Authors
Amol Ghoting, Konstantin Makarychev
Publication date
2009/6/29
Book
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Pages
827-840
Description
Over the past three decades, the suffix tree has served as a fundamental data structure in string processing. However, its widespread applicability has been hindered due to the fact that suffix tree construction does not scale well with the size of the input string. With advances in data collection and storage technologies, large strings have become ubiquitous, especially across emerging applications involving text, time series, and biological sequence data. To benefit from these advances, it is imperative that we realize a scalable suffix tree construction algorithm.
To deal with the aforementioned challenge, the past few years have seen the emergence of several disk-based suffix tree construction algorithms. However, construction times continue to be daunting -- for e.g., indexing the entire Human genome still takes over 30 hours on a system with 2 gigabytes of physical memory. In this paper, first, we empirically …
Total citations
200920102011201220132014201520162017201820192020202120221471347241331
Scholar articles
A Ghoting, K Makarychev - Proceedings of the 2009 ACM SIGMOD International …, 2009