Authors
Shuo Feng, Jacky Keung, Yan Xiao, Peichang Zhang, Xiao Yu, Xiaochun Cao
Publication date
2024/1/1
Journal
Expert Systems with Applications
Volume
235
Pages
121084
Publisher
Pergamon
Description
The class imbalance problem significantly hinders the ability of software defect prediction (SDP) models to distinguish between defective (minority class) and non-defective (majority class) software instances. Recent studies on data resampling techniques have shown that Random UnderSampling (RUS) is more effective than several complex oversampling techniques at alleviating this problem. However, RUS blindly removes majority class instances, leading to significant information loss. These studies have also pointed out that the conventional termination condition of data resampling (i.e., stopping once the minority and majority classes contain the same number of instances) can result in suboptimal performance.
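As a rough illustration of the RUS baseline and the conventional termination condition described above, the following minimal sketch drops majority class instances at random until a target class ratio is reached. It is not the method proposed in the paper; the function name `random_undersample`, the `ratio` and `seed` parameters, and the toy data are all assumptions introduced here for illustration.

```python
import random


def random_undersample(X, y, majority_label=0, ratio=1.0, seed=0):
    """Keep a random subset of majority rows so that
    n_majority <= ratio * n_minority (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    majority_idx = [i for i, label in enumerate(y) if label == majority_label]
    minority_idx = [i for i, label in enumerate(y) if label != majority_label]

    # ratio = 1.0 is the conventional termination condition (equal class sizes).
    target = min(len(majority_idx), int(round(ratio * len(minority_idx))))

    # "Blind" removal: the kept majority rows are chosen uniformly at random,
    # with no regard for how informative each instance is.
    kept_majority = rng.sample(majority_idx, target)

    keep = sorted(kept_majority + minority_idx)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (assumed): 8 non-defective (0) and 2 defective (1) instances.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_undersample(X, y)                # balanced classes
X_soft, y_soft = random_undersample(X, y, ratio=2.0)   # looser stopping point
```

Setting `ratio` above 1.0 stops the removal earlier than the balanced point, which is one simple way to experiment with a non-conventional termination condition; which majority instances are discarded remains arbitrary, and that is the information loss the description refers to.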
In fact, the undersampling technique can be likened to a recommender system or a web search engine that recommends majority class …