Classifying imbalanced data sets using similarity based hierarchical decomposition

Authors

Cigdem Beyan, Robert Fisher

Publication date

2015/5/1

Journal

Pattern recognition

Volume

Issue

Pages

1653-1672

Publisher

Pergamon

Description

Classification of data is difficult if the data is imbalanced and classes are overlapping. In recent years, more research has started to focus on classification of imbalanced data since real world data is often skewed. Traditional methods are more successful with classifying the class that has the most samples (majority class) compared to the other classes (minority classes). For the classification of imbalanced data sets, different methods are available, although each has some advantages and shortcomings. In this study, we propose a new hierarchical decomposition method for imbalanced data sets which is different from previously proposed solutions to the class imbalance problem. Additionally, unlike many other solutions, it does not require any data pre-processing step. The new method is based on clustering and outlier detection. The hierarchy is constructed using the similarity of labeled data subsets at each level …

Total citations

Cited by 194

Citations per year: 2015: 1, 2016: 9, 2017: 18, 2018: 18, 2019: 31, 2020: 34, 2021: 27, 2022: 32, 2023: 17, 2024: 7

Scholar articles

Classifying imbalanced data sets using similarity based hierarchical decomposition

C Beyan, R Fisher - Pattern recognition, 2015