Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization
Proceedings of Machine Learning Research, PMLR 89:1448-1457, 2019.
Abstract
Preconditioning is a well-known concept that can significantly improve the convergence of optimization algorithms. For noise-free problems, where good preconditioners are not known a priori, iterative linear algebra methods offer one way to efficiently construct them. For the stochastic optimization problems that dominate contemporary machine learning, however, this approach is not readily available. We propose an iterative algorithm inspired by classic iterative linear solvers that uses a probabilistic model to actively infer a preconditioner in situations where Hessian-projections can only be constructed with strong Gaussian noise. The algorithm is empirically demonstrated to efficiently construct effective preconditioners for stochastic gradient descent and its variants. Experiments on problems of comparably low dimensionality show improved convergence. In very high-dimensional problems, such as those encountered in deep learning, the preconditioner effectively becomes an automatic learning-rate adaptation scheme, which we also show to empirically work well.
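To illustrate the setting the abstract describes, the following is a minimal sketch, not the paper's algorithm: it estimates a diagonal preconditioner from Hessian-vector products corrupted by Gaussian noise (by simple averaging rather than the paper's active probabilistic inference), then compares preconditioned and plain SGD on a toy ill-conditioned quadratic. The noise scales, learning rates, and the quadratic objective are all hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ill-conditioned quadratic f(w) = 0.5 * w^T H w (hypothetical test problem).
H = np.diag([100.0, 1.0])

def noisy_hvp(v):
    # Hessian-vector product observed only through strong Gaussian noise.
    return H @ v + rng.normal(scale=5.0, size=v.shape)

# Estimate diag(H) by averaging repeated noisy projections onto unit vectors.
# (The paper uses a probabilistic model instead of this naive averaging.)
d = np.zeros(2)
for i in range(2):
    e = np.zeros(2)
    e[i] = 1.0
    d[i] = np.mean([noisy_hvp(e)[i] for _ in range(200)])

# Diagonal preconditioner P ~= diag(H)^{-1}, clamped for numerical safety.
precond = 1.0 / np.maximum(d, 1e-3)

def run(use_precond, steps=50, lr=0.009):
    # SGD with noisy gradients; optionally rescale by the preconditioner.
    w = np.array([1.0, 1.0])
    for _ in range(steps):
        g = H @ w + rng.normal(scale=0.1, size=2)  # stochastic gradient
        w -= lr * (precond * g if use_precond else g)
    return 0.5 * w @ H @ w  # final loss

# Preconditioning makes the effective Hessian ~ I, allowing a large step size;
# plain SGD must use lr < 2/100 to stay stable and so converges slowly.
loss_pre = run(True, lr=0.9)
loss_plain = run(False, lr=0.009)
```

Even with this crude averaging estimator, the preconditioned run reaches a much lower loss in the same number of steps, which is the effect the abstract's low-dimensional experiments report; in the stochastic setting the key difficulty, addressed by the paper's active inference scheme, is extracting the preconditioner efficiently from few noisy projections.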