Authors
Martin P. Robillard, Mathieu Nassif, Muhammad Sohail
Publication date
2023
Description
Unit tests serve many purposes: they help detect faults, act as documentation, and facilitate debugging activities [16, 36]. The multi-purpose nature of unit tests makes it difficult to define what constitutes a high-quality test. Intuitively, a unit test should measurably fulfill each of its purposes. Devising a metric that captures a test’s effectiveness at achieving all three of its purposes is challenging because the factors influencing each purpose do not necessarily align. For example, using a descriptive name makes a test more effective as documentation and facilitates debugging, but does not affect the test’s ability to detect faults.
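The naming example can be made concrete with a minimal sketch. It assumes a hypothetical production function, `apply_discount`, and two hypothetical tests written with Python's standard `unittest` module; none of these names come from the source. The two tests have identical bodies, so they detect exactly the same faults, but only the second name tells a reader which behavior is being checked.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical production function, used only for illustration."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


class DiscountTest(unittest.TestCase):
    # Vague name: a failure report that says "test1" reveals nothing
    # about which behavior is broken.
    def test1(self):
        with self.assertRaises(ValueError):
            apply_discount(50.0, 150.0)

    # Descriptive name: the body is identical, so fault detection is
    # unchanged, but the name documents the expected behavior.
    def test_percent_above_100_raises_value_error(self):
        with self.assertRaises(ValueError):
            apply_discount(50.0, 150.0)
```

If saved as a module, the tests can be run with `python -m unittest`. Any fault that makes one of the two tests fail makes the other fail as well; the difference lies only in how much the failure report helps with documentation and debugging.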
Researchers have proposed various metrics to estimate test quality. Most of these metrics evaluate the tests’ ability to detect faults. Of these, code coverage (the fraction of production code executed by test code) is the most widely studied and the most widely adopted by practitioners. Nevertheless, a recent study revealed that practitioners find code coverage insufficient as a test quality metric [16]. They believe code coverage paints an incomplete picture of