Less than a Single Pass: Stochastically Controlled Stochastic Gradient

Lihua Lei; Michael Jordan

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:148-156, 2017.

Abstract

We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size n; indeed, these costs are independent of n when the target accuracy is small. An experimental evaluation of SCSG on the MNIST dataset shows that it can yield accurate results on this dataset on a single commodity machine with a memory footprint of only 2.6MB and only eight disk accesses.
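The "gradient estimates at two scales" idea can be illustrated with a minimal sketch of an SVRG-style update in which the slow-scale reference gradient is computed on a random batch instead of all n samples, so the cost per outer step does not depend on n. The abstract does not spell out the algorithm, so everything concrete here is an assumption made for illustration and not the authors' implementation: the scalar least-squares loss, the batch size, the step size, the geometric inner-loop length, and the names `grad_i` and `two_scale_sgd`.

```python
import random


def grad_i(w, x, y):
    # Gradient of the per-sample loss 0.5 * (w * x - y) ** 2 with respect to scalar w.
    return (w * x - y) * x


def two_scale_sgd(data, w0=0.0, batch_size=8, step=0.05, outer_iters=200, seed=0):
    """Hypothetical SVRG-style loop with a batch-based reference gradient."""
    rng = random.Random(seed)
    w = w0
    for _ in range(outer_iters):
        # Slow scale: reference gradient on a random batch, not on the full dataset.
        batch = rng.sample(data, batch_size)
        w_ref = w
        g_ref = sum(grad_i(w_ref, x, y) for x, y in batch) / batch_size

        # Assumption: the number of inner steps is geometric with mean batch_size.
        n_inner = 0
        while rng.random() < batch_size / (batch_size + 1.0):
            n_inner += 1

        # Fast scale: single-sample steps corrected by the reference gradient.
        for _ in range(n_inner):
            x, y = rng.choice(batch)
            v = grad_i(w, x, y) - grad_i(w_ref, x, y) + g_ref
            w -= step * v
    return w


if __name__ == "__main__":
    # Noiseless toy data generated from y = 3 * x, with x in [-1, 1].
    toy = [(k / 25.0 - 1.0, 3.0 * (k / 25.0 - 1.0)) for k in range(51)]
    print(two_scale_sgd(toy))
```

On the noiseless toy data the iterate should approach the generating slope of 3. Each outer step touches only `batch_size` samples, which is the sense in which such a scheme can run in less than a single pass over the data.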

Cite this Paper

BibTeX


@InProceedings{pmlr-v54-lei17a,
  title     = {{Less than a Single Pass: Stochastically Controlled Stochastic Gradient}},
  author    = {Lei, Lihua and Jordan, Michael},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {148--156},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/lei17a/lei17a.pdf},
  url       = {https://proceedings.mlr.press/v54/lei17a.html},
  abstract  = {We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size n; indeed, these costs are independent of n when the target accuracy is small. An experimental evaluation of SCSG on the MNIST dataset shows that it can yield accurate results on this dataset on a single commodity machine with a memory footprint of only 2.6MB and only eight disk accesses.}
}

Endnote

%0 Conference Paper
%T Less than a Single Pass: Stochastically Controlled Stochastic Gradient
%A Lihua Lei
%A Michael Jordan
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-lei17a
%I PMLR
%P 148--156
%U https://proceedings.mlr.press/v54/lei17a.html
%V 54
%X We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size n; indeed, these costs are independent of n when the target accuracy is small. An experimental evaluation of SCSG on the MNIST dataset shows that it can yield accurate results on this dataset on a single commodity machine with a memory footprint of only 2.6MB and only eight disk accesses.

APA


Lei, L. & Jordan, M. (2017). Less than a Single Pass: Stochastically Controlled Stochastic Gradient. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:148-156. Available from https://proceedings.mlr.press/v54/lei17a.html.
