Automated Inference with Adaptive Batches

Soham De; Abhay Yadav; David Jacobs; Tom Goldstein

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:1504-1513, 2017.

Abstract

Classical stochastic gradient methods for optimization rely on noisy gradient approximations that become progressively less accurate as iterates approach a solution. The large noise and small signal in the resulting gradients make it difficult to use them for adaptive stepsize selection and automatic stopping. We propose alternative “big batch” SGD schemes that adaptively grow the batch size over time to maintain a nearly constant signal-to-noise ratio in the gradient approximation. The resulting methods have similar convergence rates to classical SGD, and do not require convexity of the objective. The high-fidelity gradients enable automated learning rate selection and do not require stepsize decay. Big batch methods are thus easily automated and can run with little or no oversight.
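
The idea of growing the batch to hold the gradient's signal-to-noise ratio roughly constant can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function name `big_batch_sgd`, the least-squares objective, the threshold `theta`, the starting size `batch0`, the fixed step size, and the batch-doubling rule. It should not be read as the authors' algorithm.

```python
import numpy as np

def big_batch_sgd(A, b, step=0.1, theta=0.5, batch0=8, iters=200, seed=0):
    """SGD on 0.5 * mean((A @ x - b)**2) with a batch that grows on demand.

    theta, batch0 and the doubling rule are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    batch = min(batch0, n)
    history = []  # batch size actually used at each step
    for _ in range(iters):
        while True:
            idx = rng.choice(n, size=batch, replace=False)
            # Per-sample gradients of 0.5 * (a_i @ x - b_i)**2.
            residual = A[idx] @ x - b[idx]
            grads = residual[:, None] * A[idx]
            g = grads.mean(axis=0)
            # Estimated variance of the batch-mean gradient (the "noise").
            noise = grads.var(axis=0, ddof=1).sum() / batch
            signal = g @ g
            # Accept the batch if the noise is small relative to the signal,
            # or if the batch already covers the whole data set.
            if noise <= theta**2 * signal or batch == n:
                break
            batch = min(2 * batch, n)
        x -= step * g
        history.append(batch)
    return x, history
```

Calling `big_batch_sgd(A, b)` on a small synthetic regression problem returns the final iterate together with the batch size used at each step. In this sketch the batch size never shrinks, so the recorded sizes show when the noise test forced a larger batch.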

Cite this Paper

BibTeX


@InProceedings{pmlr-v54-de17a,
  title = 	 {{Automated Inference with Adaptive Batches}},
  author = 	 {De, Soham and Yadav, Abhay and Jacobs, David and Goldstein, Tom},
  booktitle = 	 {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1504--1513},
  year = 	 {2017},
  editor = 	 {Singh, Aarti and Zhu, Jerry},
  volume = 	 {54},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {20--22 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v54/de17a/de17a.pdf},
  url = 	 {https://proceedings.mlr.press/v54/de17a.html},
  abstract = 	 {Classical stochastic gradient methods for optimization rely on noisy gradient approximations that become progressively less accurate as iterates approach a solution. The large noise and small signal in the resulting gradients make it difficult to use them for adaptive stepsize selection and automatic stopping. We propose alternative “big batch” SGD schemes that adaptively grow the batch size over time to maintain a nearly constant signal-to-noise ratio in the gradient approximation. The resulting methods have similar convergence rates to classical SGD, and do not require convexity of the objective. The high-fidelity gradients enable automated learning rate selection and do not require stepsize decay. Big batch methods are thus easily automated and can run with little or no oversight.}
}

Endnote

%0 Conference Paper
%T Automated Inference with Adaptive Batches
%A Soham De
%A Abhay Yadav
%A David Jacobs
%A Tom Goldstein
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-de17a
%I PMLR
%P 1504--1513
%U https://proceedings.mlr.press/v54/de17a.html
%V 54
%X Classical stochastic gradient methods for optimization rely on noisy gradient approximations that become progressively less accurate as iterates approach a solution. The large noise and small signal in the resulting gradients make it difficult to use them for adaptive stepsize selection and automatic stopping. We propose alternative “big batch” SGD schemes that adaptively grow the batch size over time to maintain a nearly constant signal-to-noise ratio in the gradient approximation. The resulting methods have similar convergence rates to classical SGD, and do not require convexity of the objective. The high-fidelity gradients enable automated learning rate selection and do not require stepsize decay. Big batch methods are thus easily automated and can run with little or no oversight.

APA


De, S., Yadav, A., Jacobs, D. & Goldstein, T. (2017). Automated Inference with Adaptive Batches. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:1504-1513. Available from https://proceedings.mlr.press/v54/de17a.html.
