Authors
Junjie Chen, Xiaoting He, Qingwei Lin, Hongyu Zhang, Dan Hao, Feng Gao, Zhangwei Xu, Yingnong Dang, Dongmei Zhang
Publication date
2019/11/11
Conference
2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE)
Pages
364-375
Publisher
IEEE
Description
In recent years, online service systems have become increasingly popular. Incidents in these systems can cause significant economic loss and customer dissatisfaction. Incident triage, the process of assigning a new incident to the responsible team, is vitally important for quick recovery of the affected service. Our industry experience shows that, in practice, incident triage is not conducted only once at the beginning but is a continuous process in which engineers from different teams discuss an incident intensively among themselves and continuously refine the incident-triage result until the correct assignment is reached. In particular, our empirical study on 8 real online service systems shows that the percentage of incidents that were reassigned ranges from 5.43% to 68.26%, and that the number of discussion items before the correct assignment is reached is up to 11.32 on average. To improve …
Total citations
[Chart: citations per year, 2019 to 2024]
Scholar articles
J Chen, X He, Q Lin, H Zhang, D Hao, F Gao, Z Xu… - 2019 34th IEEE/ACM International Conference on …, 2019