Authors
Gustavo EAPA Batista, Ronaldo C Prati, Maria Carolina Monard
Publication date
2004/6/1
Journal
ACM SIGKDD Explorations Newsletter
Volume
6
Issue
1
Pages
20-29
Publisher
ACM
Description
Several aspects may influence the performance achieved by existing learning systems. One of these is reported to be class imbalance, in which the training examples belonging to one class heavily outnumber the examples in the other class. In this situation, which arises in real-world data describing an infrequent but important event, the learning system may have difficulty learning the concept related to the minority class. In this work we perform a broad experimental evaluation involving ten methods, three of them proposed by the authors, to deal with the class imbalance problem in thirteen UCI data sets. Our experiments provide evidence that class imbalance does not systematically hinder the performance of learning systems. In fact, the problem seems to be related to learning with too few minority class examples in the presence of other complicating factors, such …
Total citations
Citations per year: 2005: 17, 2006: 33, 2007: 44, 2008: 66, 2009: 79, 2010: 101, 2011: 106, 2012: 129, 2013: 148, 2014: 159, 2015: 164, 2016: 174, 2017: 215, 2018: 299, 2019: 382, 2020: 456, 2021: 494, 2022: 552, 2023: 573, 2024: 292
Scholar articles
GEAPA Batista, RC Prati, MC Monard - A study of the behavior of several methods for …, 2004
GEAPA Batista, RC Prati, MC Monard - A survey of the behavior of …, 2004
B Batista, RC Prati - Balancing Training Data for Automated Annotation of …, 2003
GEAPA Batista, RC Prati, MC Monard - A study of the behaviour of …, 2004
MC Monard, G Batista - Proc. Advances in Logic, Artificial Intelligence and …, 2003