Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:4150-4159, 2017.
Abstract
We propose a novel class of stochastic, adaptive methods for minimizing self-concordant functions which can be expressed as an expected value. These methods generate an estimate of the true objective function by taking the empirical mean over a sample drawn at each step, making the problem tractable. The use of adaptive step sizes eliminates the need for the user to supply a step size. Methods in this class include extensions of gradient descent (GD) and BFGS. We show that, given a suitable amount of sampling, the stochastic adaptive GD attains linear convergence in expectation, and with further sampling, the stochastic adaptive BFGS attains R-superlinear convergence. We present experiments showing that these methods compare favorably to SGD.
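To make the two ingredients of the abstract concrete — estimating the objective by an empirical mean over a fresh sample at each step, and choosing the step size adaptively rather than asking the user for one — here is a minimal sketch in the GD case. It is an illustrative stand-in, not the authors' algorithm: the toy objective, the batch size, and the Armijo backtracking rule used as the "adaptive" step-size choice are all assumptions, since the abstract does not specify the paper's adaptive rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_objective(x, xi):
    # Toy stochastic objective f(x, xi) = 0.5 * ||x - xi||^2,
    # so E[f] is minimized at E[xi]. (Illustrative choice only.)
    return 0.5 * np.sum((x - xi) ** 2, axis=-1).mean()

def sampled_gradient(x, xi):
    # Gradient of the empirical-mean estimate over the sample xi.
    return (x - xi).mean(axis=0)

def stochastic_adaptive_gd(x0, sample, n_iters=100, batch=64,
                           step0=1.0, shrink=0.5, c=1e-4):
    """Sampled-gradient descent with a backtracking (Armijo) line search
    on the sampled objective, standing in for the paper's adaptive
    step-size rule. No user-supplied step size is required."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        xi = sample(batch)            # fresh sample drawn at each step
        g = sampled_gradient(x, xi)
        f0 = sampled_objective(x, xi)
        t = step0
        # Shrink t until the sampled objective decreases sufficiently.
        while sampled_objective(x - t * g, xi) > f0 - c * t * (g @ g):
            t *= shrink
        x = x - t * g
    return x

# Samples concentrated around (3, 3); the iterates should approach it.
sample = lambda n: rng.normal(loc=3.0, scale=1.0, size=(n, 2))
x_star = stochastic_adaptive_gd(np.zeros(2), sample)
```

The BFGS variant described in the abstract would replace the raw gradient direction `g` with a quasi-Newton direction built from accumulated curvature pairs; the sampling and adaptive step-size structure stay the same.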