Authors
Shilin He, Qingwei Lin, Jian-Guang Lou, Hongyu Zhang, Michael R Lyu, Dongmei Zhang
Publication date
2018/10/26
Book
Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Pages
60-70
Description
Logs are often used for troubleshooting in large-scale software systems. For a cloud-based online system that provides 24/7 service, a huge number of logs can be generated every day. However, these logs are generally highly imbalanced: most indicate normal system operations, and only a small percentage reveal impactful problems. Problems that lead to a decline in system KPIs (Key Performance Indicators) are impactful and should be fixed by engineers with high priority. Furthermore, there are various types of system problems, which are hard to distinguish manually. In this paper, we propose Log3C, a novel clustering-based approach to promptly and precisely identify impactful system problems by utilizing both log sequences (sequences of log events) and system KPIs. More specifically, we design a novel cascading clustering algorithm, which can greatly save the clustering …
Total citations
2019: 11, 2020: 24, 2021: 45, 2022: 36, 2023: 41, 2024: 23
Scholar articles
S He, Q Lin, JG Lou, H Zhang, MR Lyu, D Zhang - Proceedings of the 2018 26th ACM joint meeting on …, 2018