Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics

Difan Zou, Pan Xu, Quanquan Gu
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:2936-2945, 2019.

Abstract

We study stochastic variance-reduced Langevin dynamics algorithms, SVRG-LD and SAGA-LD \citep{dubey2016variance}, for sampling from non-log-concave distributions. Under certain assumptions on the log density function, we establish convergence guarantees for SVRG-LD and SAGA-LD in $2$-Wasserstein distance. More specifically, we show that both SVRG-LD and SAGA-LD require $\tilde O\big(n+n^{3/4}/\epsilon^2 + n^{1/2}/\epsilon^4\big)\cdot \exp\big(\tilde O(d+\gamma)\big)$ stochastic gradient evaluations to achieve $\epsilon$-accuracy in $2$-Wasserstein distance, which outperforms the $\tilde O\big(n/\epsilon^4\big)\cdot \exp\big(\tilde O(d+\gamma)\big)$ gradient complexity of the Langevin Monte Carlo method \citep{raginsky2017non}. Experiments on both synthetic and real data support our theory.
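The SVRG-LD algorithm studied here combines the standard Langevin step (a gradient step plus injected Gaussian noise) with an SVRG-style control variate: a full gradient is computed at a periodic snapshot point, and each inner step corrects a minibatch gradient by the snapshot's minibatch gradient. The following is a minimal sketch of that update, not the paper's implementation; the toy finite-sum potential (an average of quadratics, hence log-concave) and all names (`svrg_ld`, `grad_fi`, step size `eta`, etc.) are illustrative choices made here only to keep the sketch self-contained, whereas the paper's analysis targets non-log-concave potentials.

```python
import numpy as np

# Toy finite-sum potential f(x) = (1/n) * sum_i f_i(x) with f_i(x) = 0.5*||x - a_i||^2,
# so the target density exp(-f) is a Gaussian centered at the mean of the a_i.
rng = np.random.default_rng(0)
n, d = 100, 2
a = rng.normal(size=(n, d))

def grad_fi(x, idx):
    # Minibatch gradient: average of the component gradients x - a_i over idx.
    return np.mean(x - a[idx], axis=0)

def svrg_ld(eta=0.05, epochs=30, inner=100, batch=10):
    x = np.zeros(d)
    out = []
    for _ in range(epochs):
        x_snap = x.copy()                       # snapshot point for this epoch
        g_full = np.mean(x_snap - a, axis=0)    # full gradient at the snapshot
        for _ in range(inner):
            idx = rng.integers(0, n, size=batch)
            # SVRG variance-reduced gradient estimate (control variate correction)
            g = grad_fi(x, idx) - grad_fi(x_snap, idx) + g_full
            # Langevin step: gradient step plus sqrt(2*eta) Gaussian noise
            x = x - eta * g + np.sqrt(2 * eta) * rng.normal(size=d)
            out.append(x.copy())
    return np.array(out)

samples = svrg_ld()
```

For this toy target, the empirical mean of the samples should approach the mean of the `a_i`; SAGA-LD differs only in replacing the periodic snapshot with a per-component table of stored gradients updated on the fly.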

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-zou19a,
  title     = {Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics},
  author    = {Zou, Difan and Xu, Pan and Gu, Quanquan},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {2936--2945},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/zou19a/zou19a.pdf},
  url       = {http://proceedings.mlr.press/v89/zou19a.html},
  abstract  = {We study stochastic variance-reduced Langevin dynamics algorithms, SVRG-LD and SAGA-LD \citep{dubey2016variance}, for sampling from non-log-concave distributions. Under certain assumptions on the log density function, we establish convergence guarantees for SVRG-LD and SAGA-LD in $2$-Wasserstein distance. More specifically, we show that both SVRG-LD and SAGA-LD require $\tilde O\big(n+n^{3/4}/\epsilon^2 + n^{1/2}/\epsilon^4\big)\cdot \exp\big(\tilde O(d+\gamma)\big)$ stochastic gradient evaluations to achieve $\epsilon$-accuracy in $2$-Wasserstein distance, which outperforms the $\tilde O\big(n/\epsilon^4\big)\cdot \exp\big(\tilde O(d+\gamma)\big)$ gradient complexity of the Langevin Monte Carlo method \citep{raginsky2017non}. Experiments on both synthetic and real data support our theory.}
}
Endnote
%0 Conference Paper
%T Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics
%A Difan Zou
%A Pan Xu
%A Quanquan Gu
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-zou19a
%I PMLR
%J Proceedings of Machine Learning Research
%P 2936--2945
%U http://proceedings.mlr.press/v89/zou19a.html
%V 89
%W PMLR
%X We study stochastic variance-reduced Langevin dynamics algorithms, SVRG-LD and SAGA-LD \citep{dubey2016variance}, for sampling from non-log-concave distributions. Under certain assumptions on the log density function, we establish convergence guarantees for SVRG-LD and SAGA-LD in $2$-Wasserstein distance. More specifically, we show that both SVRG-LD and SAGA-LD require $\tilde O\big(n+n^{3/4}/\epsilon^2 + n^{1/2}/\epsilon^4\big)\cdot \exp\big(\tilde O(d+\gamma)\big)$ stochastic gradient evaluations to achieve $\epsilon$-accuracy in $2$-Wasserstein distance, which outperforms the $\tilde O\big(n/\epsilon^4\big)\cdot \exp\big(\tilde O(d+\gamma)\big)$ gradient complexity of the Langevin Monte Carlo method \citep{raginsky2017non}. Experiments on both synthetic and real data support our theory.
APA
Zou, D., Xu, P. & Gu, Q. (2019). Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:2936-2945.
