Authors
Marwan Hassani, Philipp Kranen, Rajveer Saini, Thomas Seidl
Publication date
2014/6/30
Book
Proceedings of the 26th International Conference on Scientific and Statistical Database Management
Pages
1-4
Description
Clustering of high-dimensional streaming data is an emerging field of research. A real-life data stream imposes many challenges on the clustering task, as an endless amount of data arrives constantly. Much research has been done on full-space stream clustering. To handle the varying speeds of the data stream, "anytime" algorithms have been proposed, but so far only for full-space stream clustering. However, data streams from many application domains contain an abundance of dimensions; the clusters often exist only in specific subspaces (subsets of dimensions) and do not show up in the full feature space. In this paper, the first algorithm that considers both the high dimensionality and the varying speeds of streaming data is proposed. The algorithm, called SubClusTree, can flexibly adapt to different stream speeds and makes the best use of the available time to provide a high-quality subspace clustering. The …
Total citations
[Citations-per-year chart, 2015 to 2023; per-year counts not recoverable from the extracted text]