A Progressive Batching L-BFGS Method for Machine Learning
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:620-629, 2018.
Abstract
The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components (progressive batching, a stochastic line search, and stable quasi-Newton updating) and that performs well on training logistic regression and deep neural networks. We provide supporting convergence theory for the method.
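To make the three components concrete, the following is a minimal sketch for logistic regression and not the authors' implementation. Several choices are assumptions made here for illustration: the sample size grows by a fixed geometric factor (the abstract does not say how the sample size is increased), the stochastic line search is plain Armijo backtracking on the current batch, and stability of the quasi-Newton update is approximated by computing the gradient difference on the same batch and skipping pairs with insufficient curvature. The names `progressive_lbfgs`, `two_loop`, and `loss_and_grad` are hypothetical.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean logistic loss and gradient for labels y in {0, 1}."""
    z = X @ w
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    sigmoid = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    grad = X.T @ (sigmoid - y) / len(y)
    return loss, grad

def two_loop(grad, s_list, y_list):
    """L-BFGS two-loop recursion: approximates (inverse Hessian) @ grad."""
    q = grad.copy()
    alphas = []
    for s, yv in zip(reversed(s_list), reversed(y_list)):  # newest to oldest
        a = (s @ q) / (yv @ s)
        alphas.append(a)
        q -= a * yv
    if s_list:  # scale by the usual initial Hessian estimate
        q *= (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
    for a, s, yv in zip(reversed(alphas), s_list, y_list):  # oldest to newest
        b = (yv @ q) / (yv @ s)
        q += (a - b) * s
    return q

def progressive_lbfgs(X, y, batch0=32, growth=1.5, memory=10, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    s_list, y_list = [], []
    batch = float(batch0)
    for _ in range(iters):
        # Progressive batching: draw a sample whose size grows over time.
        idx = rng.choice(n, size=min(int(batch), n), replace=False)
        Xb, yb = X[idx], y[idx]
        f, g = loss_and_grad(w, Xb, yb)
        p = -two_loop(g, s_list, y_list)

        # Line search on the sampled objective (Armijo backtracking).
        t = 1.0
        for _ in range(30):
            f_new, g_new = loss_and_grad(w + t * p, Xb, yb)
            if f_new <= f + 1e-4 * t * (g @ p):
                break
            t *= 0.5

        # Curvature pair from gradients on the SAME batch, so that the
        # difference reflects curvature rather than sampling noise.
        s = t * p
        yv = g_new - g
        w = w + s
        if yv @ s > 1e-10 * (s @ s):  # skip pairs without positive curvature
            s_list.append(s)
            y_list.append(yv)
            if len(s_list) > memory:
                s_list.pop(0)
                y_list.pop(0)

        batch = min(batch * growth, float(n))  # assumed geometric growth rule
    return w
```

Calling `progressive_lbfgs(X, y)` with a feature matrix `X` of shape `(n, d)` and a 0/1 label vector `y` returns the fitted weight vector. With the default settings the sample size grows by a factor of 1.5 per iteration, capped at the size of the data set, so later iterations behave like full batch L-BFGS.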