Bugs in machine learning-based systems: a faultload benchmark

Authors

Mohammad Mehdi Morovati, Amin Nikanjam, Foutse Khomh, Zhen Ming Jiang

Publication date

2023/5

Journal

Empirical Software Engineering

Publisher

Springer US

Description

The rapidly growing use of Machine Learning (ML) in various domains has drawn more attention to the quality of ML components. As a result, a growing number of techniques and tools aim to improve the quality of ML components and to integrate them safely into ML-based systems. Although most of these tools use bugs’ lifecycle, there is no standard benchmark of bugs with which to assess their performance, compare them, and discuss their advantages and weaknesses. In this study, we first investigate the reproducibility and verifiability of bugs in ML-based systems and identify the most important factors in each. Then, we explore the challenges of generating a benchmark of bugs in ML-based software systems and provide a bug benchmark, named defect4ML, that satisfies all criteria of a standard benchmark, i.e., relevance, reproducibility, fairness, verifiability, and usability. This faultload benchmark contains …

Total citations

Cited by 11

2022: 1, 2023: 3, 2024: 7

Scholar articles

Bugs in machine learning-based systems: a faultload benchmark

MM Morovati, A Nikanjam, F Khomh, ZM Jiang - Empirical Software Engineering, 2023