Convergence of Langevin MCMC in KL-divergence

Xiang Cheng, Peter Bartlett
Proceedings of Algorithmic Learning Theory, PMLR 83:186-211, 2018.

Abstract

Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$-smooth and $m$-strongly convex, discrete Langevin diffusion produces a distribution $p$ with $\mathrm{KL}(p \,\|\, p^*) \le \epsilon$ in $\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.
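The discrete Langevin diffusion the abstract analyzes can be sketched with the standard unadjusted Langevin algorithm (ULA) update, $x_{k+1} = x_k + \eta \nabla \log p^*(x_k) + \sqrt{2\eta}\,\xi_k$ with $\xi_k \sim \mathcal{N}(0, I)$. The sketch below is a minimal illustration, not the paper's analysis; the standard-Gaussian target, step size, and chain length are assumptions chosen for demonstration (for this target, $-\log p^*$ is $1$-smooth and $1$-strongly convex).

```python
import numpy as np

def ula_chain(grad_log_p, x0, step, n_steps, rng):
    """Run the unadjusted Langevin algorithm:
    x_{k+1} = x_k + step * grad_log_p(x_k) + sqrt(2*step) * N(0, I)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

# Illustrative target: standard Gaussian, log p*(x) = -x^2/2 + const,
# so grad log p*(x) = -x. We run 2000 independent one-dimensional chains
# (one per array entry) from a far-off start and inspect the samples.
rng = np.random.default_rng(0)
samples = ula_chain(lambda x: -x, x0=np.full(2000, 5.0),
                    step=0.05, n_steps=400, rng=rng)
print(samples.mean(), samples.var())  # both should be near 0 and 1
```

Note that with a fixed step size the chain's stationary distribution is slightly biased away from $p^*$; the paper's $\tilde{O}(d/\epsilon)$ bound quantifies how small the step must be (and how many steps are needed) to bring the KL-divergence below $\epsilon$.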

Cite this Paper


BibTeX
@InProceedings{pmlr-v83-cheng18a,
  title     = {Convergence of Langevin MCMC in KL-divergence},
  author    = {Cheng, Xiang and Bartlett, Peter},
  booktitle = {Proceedings of Algorithmic Learning Theory},
  pages     = {186--211},
  year      = {2018},
  editor    = {Janoos, Firdaus and Mohri, Mehryar and Sridharan, Karthik},
  volume    = {83},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v83/cheng18a/cheng18a.pdf},
  url       = {https://proceedings.mlr.press/v83/cheng18a.html},
  abstract  = {Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$-smooth and $m$-strongly convex, discrete Langevin diffusion produces a distribution $p$ with $\mathrm{KL}(p \,\|\, p^*) \le \epsilon$ in $\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.}
}
Endnote
%0 Conference Paper
%T Convergence of Langevin MCMC in KL-divergence
%A Xiang Cheng
%A Peter Bartlett
%B Proceedings of Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Firdaus Janoos
%E Mehryar Mohri
%E Karthik Sridharan
%F pmlr-v83-cheng18a
%I PMLR
%P 186--211
%U https://proceedings.mlr.press/v83/cheng18a.html
%V 83
%X Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$-smooth and $m$-strongly convex, discrete Langevin diffusion produces a distribution $p$ with $\mathrm{KL}(p \,\|\, p^*) \le \epsilon$ in $\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.
APA
Cheng, X. & Bartlett, P. (2018). Convergence of Langevin MCMC in KL-divergence. Proceedings of Algorithmic Learning Theory, in Proceedings of Machine Learning Research 83:186-211. Available from https://proceedings.mlr.press/v83/cheng18a.html.