Authors
Amol Ghoting, Srinivasan Parthasarathy, Matthew Eric Otey
Publication date
2008/6
Journal
Data Mining and Knowledge Discovery
Volume
16
Pages
349-364
Publisher
Springer US
Description
Defining outliers by their distance to neighboring data points has been shown to be an effective non-parametric approach to outlier detection. In recent years, many research efforts have looked at developing fast distance-based outlier detection algorithms. Several of the existing distance-based outlier detection algorithms report log-linear time performance as a function of the number of data points on many real low-dimensional datasets. However, these algorithms are unable to deliver the same level of performance on high-dimensional datasets, since their scaling behavior is exponential in the number of dimensions. In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional datasets. RBRP scales log-linearly as a function of the number of data points and linearly as a function of the number of dimensions. Our empirical evaluation …
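As a point of reference for the abstract, the brute-force baseline for distance-based outlier detection can be sketched as below. This is not RBRP: it is a quadratic-time illustration that assumes one common definition, where a point's outlier score is the distance to its k-th nearest neighbor and the n highest-scoring points are reported. The function names, the parameters k and n, and the sample data are all hypothetical and do not come from the paper.

```python
import heapq
import math


def kth_neighbor_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor (brute force)."""
    # Euclidean distance to every other point; O(n * d) work per query point.
    dists = [math.dist(points[i], q) for j, q in enumerate(points) if j != i]
    # The k smallest distances come back sorted ascending; the last is the k-th.
    return heapq.nsmallest(k, dists)[-1]


def top_outliers(points, k, n):
    """Indices of the n points with the largest k-th-neighbor distance."""
    # Assumed scoring rule: a larger k-th-neighbor distance means more outlying.
    scores = [(kth_neighbor_distance(points, i, k), i) for i in range(len(points))]
    return [i for _, i in heapq.nlargest(n, scores)]


# Hypothetical toy data: four clustered points and one far-away point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
outliers = top_outliers(points, k=2, n=1)
print(outliers)  # [4]
```

Every point is compared against every other point, so the cost grows quadratically with the number of data points. That is the cost that fast distance-based methods such as those discussed in the abstract aim to avoid.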
Total citations
[Chart: citations per year, 2007-2024]
Scholar articles
A Ghoting, S Parthasarathy, ME Otey - Data Mining and Knowledge Discovery, 2008