Authors
Qingwei Lin, Ken Hsieh, Yingnong Dang, Hongyu Zhang, Kaixin Sui, Yong Xu, Jian-Guang Lou, Chenggang Li, Youjiang Wu, Randolph Yao, Murali Chintalapati, Dongmei Zhang
Publication date
2018/10/26
Book
Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Pages
480-490
Description
In recent years, many traditional software systems have migrated to cloud computing platforms and are provided as online services. The service quality matters because system failures could seriously affect business and user experience. A cloud service system typically contains a large number of computing nodes. In reality, nodes may fail and affect service availability. In this paper, we propose a failure prediction technique, which can predict the failure-proneness of a node in a cloud service system based on historical data, before node failure actually happens. The ability to predict faulty nodes enables the allocation and migration of virtual machines to the healthy nodes, therefore improving service availability. Predicting node failure in cloud service systems is challenging, because a node failure could be caused by a variety of reasons and reflected by many temporal and spatial signals. Furthermore, the failure …
Total citations
Citations per year (bar chart): 2019: 6, 2020: 16, 2021: 25, 2022: 28, 2023: 28, 2024: 11
Scholar articles
Q Lin, K Hsieh, Y Dang, H Zhang, K Sui, Y Xu, JG Lou… - Proceedings of the 2018 26th ACM joint meeting on …, 2018