Authors
Irene Ntoutsi, Arthur Zimek, Themis Palpanas, Peer Kröger, Hans-Peter Kriegel
Publication date
2012/4/26
Book
Proceedings of the 2012 SIAM international conference on data mining
Pages
987-998
Publisher
Society for Industrial and Applied Mathematics
Description
Clustering of high-dimensional data streams is an important problem in many application domains, a prominent example being network monitoring. Several approaches have recently been proposed that address the different aspects of the problem independently. There exist methods for clustering over full-dimensional streams and methods for finding clusters in subspaces of high-dimensional static data. Yet only a few approaches proposed so far tackle both the stream and the high-dimensionality aspects of the problem simultaneously. In this work, we propose a new density-based projected clustering algorithm, HDDSTREAM, for high-dimensional data streams. Our algorithm summarizes both the data points and the dimensions where these points are grouped together and maintains these summaries online, as new points arrive over time and old points expire due to ageing. Our experimental results …
Total citations
[Chart: citations per year, 2012 to 2024]
Scholar articles
I Ntoutsi, A Zimek, T Palpanas, P Kröger, HP Kriegel - Proceedings of the 2012 SIAM international conference …, 2012