On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:763-772, 2018.
Abstract
We provide convergence guarantees in Wasserstein distance for a variety of variance-reduction methods: SAGA Langevin diffusion, SVRG Langevin diffusion and control-variate underdamped Langevin diffusion. We analyze these methods under a uniform set of assumptions on the log-posterior distribution, assuming it to be smooth, strongly convex and Hessian Lipschitz. This is achieved by a new proof technique combining ideas from finite-sum optimization and the analysis of sampling methods. Our sharp theoretical bounds allow us to identify regimes of interest where each method performs better than the others. Our theory is verified with experiments on real-world and synthetic datasets.
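As a rough illustration of what variance reduction means for one of the methods named above, the following is a minimal sketch of an SVRG-style Langevin update on a toy one-dimensional target. It is an assumption-laden example, not the algorithm as specified in the paper: the quadratic potentials `0.5 * c[i] * (x - a[i])**2`, the function names `svrg_gradient` and `svrg_langevin`, and all parameters (`step`, `batch`, `epoch_len`) are hypothetical and chosen only so that the log-posterior is smooth and strongly convex with a constant Hessian.

```python
import math
import random


def svrg_gradient(grad_i, n_terms, x, snapshot, full_grad_at_snapshot, batch_idx):
    """SVRG-style estimate of sum_i grad_i(i, x).

    Uses the full gradient stored at `snapshot` plus a rescaled minibatch
    correction. The estimate is unbiased when batch_idx is drawn uniformly.
    """
    correction = sum(grad_i(i, x) - grad_i(i, snapshot) for i in batch_idx)
    return full_grad_at_snapshot + (n_terms / len(batch_idx)) * correction


def svrg_langevin(a, c, step, n_steps, batch, epoch_len, seed=0):
    """Run a toy SVRG Langevin chain targeting
    p(x) proportional to exp(-sum_i 0.5 * c[i] * (x - a[i])**2).

    Assumption: minibatch indices are sampled with replacement, and the
    snapshot is refreshed every `epoch_len` steps.
    """
    rng = random.Random(seed)
    n_terms = len(a)

    def grad_i(i, x):
        # Gradient of the i-th potential term 0.5 * c[i] * (x - a[i])**2.
        return c[i] * (x - a[i])

    x = 0.0
    snapshot = x
    full_grad = 0.0
    samples = []
    for k in range(n_steps):
        if k % epoch_len == 0:
            # Start of an epoch: store the iterate and its full gradient.
            snapshot = x
            full_grad = sum(grad_i(i, snapshot) for i in range(n_terms))
        batch_idx = [rng.randrange(n_terms) for _ in range(batch)]
        g = svrg_gradient(grad_i, n_terms, x, snapshot, full_grad, batch_idx)
        # Discretized Langevin step: gradient drift plus Gaussian noise.
        x = x - step * g + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples
```

In this sketch the minibatch noise in the gradient estimate scales with the distance between the current iterate and the snapshot, so it vanishes when the two coincide. That is the basic mechanism by which the stored full gradient reduces variance relative to a plain stochastic gradient.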