Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

Zeyuan Allen-Zhu; Zheng Qu; Peter Richtarik; Yang Yuan

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1110-1119, 2016.

Abstract

Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also the same as accelerated stochastic gradient descent that is one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to $\sqrt{n}$. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.
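
As a rough illustration of the sampling rule described in the abstract, the sketch below computes probabilities proportional to the square root of each coordinate's smoothness parameter and draws one coordinate from them. It is not the paper's algorithm: it covers only the sampling step, and the function names and the example smoothness values are hypothetical, chosen for illustration.

```python
import math
import random


def sqrt_smoothness_probabilities(smoothness):
    """Return p_i = sqrt(L_i) / sum_j sqrt(L_j) for smoothness parameters L_i."""
    if not smoothness or any(L < 0 for L in smoothness):
        raise ValueError("expected a non-empty list of non-negative numbers")
    roots = [math.sqrt(L) for L in smoothness]
    total = sum(roots)
    if total == 0:
        raise ValueError("at least one smoothness parameter must be positive")
    return [r / total for r in roots]


def sample_coordinate(probabilities, rng=random):
    """Draw one coordinate index according to the given probabilities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


# Hypothetical smoothness parameters for a 3-coordinate problem (assumed values).
L = [1.0, 4.0, 9.0]
p = sqrt_smoothness_probabilities(L)  # proportional to 1 : 2 : 3
i = sample_coordinate(p, random.Random(0))  # seeded generator for repeatability
```

With these assumed values, the coordinate with smoothness 9 is drawn three times as often as the one with smoothness 1, rather than nine times as often under sampling proportional to the smoothness parameter itself.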

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-allen-zhuc16,
  title = 	 {Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling},
  author = 	 {Allen-Zhu, Zeyuan and Qu, Zheng and Richtarik, Peter and Yuan, Yang},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1110--1119},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/allen-zhuc16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/allen-zhuc16.html},
  abstract = 	 {Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also the same as accelerated stochastic gradient descent that is one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to $\sqrt{n}$. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.}
}

Endnote

%0 Conference Paper
%T Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling
%A Zeyuan Allen-Zhu
%A Zheng Qu
%A Peter Richtarik
%A Yang Yuan
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-allen-zhuc16
%I PMLR
%P 1110--1119
%U https://proceedings.mlr.press/v48/allen-zhuc16.html
%V 48
%X Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also the same as accelerated stochastic gradient descent that is one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to \sqrt{n}. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.

RIS


TY  - CPAPER
TI  - Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling
AU  - Zeyuan Allen-Zhu
AU  - Zheng Qu
AU  - Peter Richtarik
AU  - Yang Yuan
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-allen-zhuc16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1110
EP  - 1119
L1  - http://proceedings.mlr.press/v48/allen-zhuc16.pdf
UR  - https://proceedings.mlr.press/v48/allen-zhuc16.html
AB  - Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also the same as accelerated stochastic gradient descent that is one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to \sqrt{n}. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.
ER  -

APA


Allen-Zhu, Z., Qu, Z., Richtarik, P. & Yuan, Y. (2016). Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1110-1119. Available from https://proceedings.mlr.press/v48/allen-zhuc16.html.
