Authors
Zeyuan Allen-Zhu, Zheng Qu, Peter Richtárik, Yang Yuan
Publication date
2016/6/11
Conference
ICML 2016: International Conference on Machine Learning
Pages
1110-1119
Description
Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also equivalent to accelerated stochastic gradient descent, one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor of up to $\sqrt{n}$. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.
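A minimal illustration of the sampling rule described in the abstract (not the authors' full accelerated method): the Python sketch below draws coordinates with probability proportional to $\sqrt{L_i}$, where $L_i$ is the coordinate-wise smoothness parameter, and applies plain (non-accelerated) coordinate descent steps to a quadratic $f(x) = \tfrac{1}{2}x^\top A x - b^\top x$. The function names and the test problem are illustrative assumptions, not code from the paper.

```python
import numpy as np

def coordinate_descent_sqrt_sampling(A, b, num_iters=10000, seed=0):
    """Minimize f(x) = 0.5 x^T A x - b^T x by randomized coordinate descent.

    The coordinate-wise smoothness parameters are L_i = A[i, i]; coordinates
    are sampled with probability proportional to sqrt(L_i), as in the abstract.
    (Plain coordinate descent sketch, not the accelerated method of the paper.)
    """
    rng = np.random.default_rng(seed)
    n = len(b)
    L = np.diag(A).copy()                 # smoothness of f along coordinate i
    p = np.sqrt(L) / np.sqrt(L).sum()     # non-uniform sampling distribution
    x = np.zeros(n)
    for _ in range(num_iters):
        i = rng.choice(n, p=p)            # pick coordinate i with prob. ~ sqrt(L_i)
        g_i = A[i] @ x - b[i]             # partial derivative of f along coordinate i
        x[i] -= g_i / L[i]                # step of size 1/L_i along that coordinate
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((50, 20))
    A = M.T @ M + 0.1 * np.eye(20)        # positive definite quadratic test problem
    b = rng.standard_normal(20)
    x = coordinate_descent_sqrt_sampling(A, b)
    print("residual norm:", np.linalg.norm(A @ x - b))
```

The accelerated variant in the paper couples such coordinate steps with momentum sequences; this sketch only shows how the $\sqrt{L_i}$-proportional sampling distribution is formed and used.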
Total citations
Citations per year, 2015–2024 (histogram)
Scholar articles
Z Allen-Zhu, Z Qu, P Richtárik, Y Yuan - International Conference on Machine Learning, 2016