Convergence of Langevin MCMC in KL-divergence

Xiang Cheng, Peter Bartlett
Proceedings of Algorithmic Learning Theory, PMLR 83:186-211, 2018.

Abstract

Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$-smooth and $m$-strongly convex, discrete Langevin diffusion produces a distribution $p$ with $\mathrm{KL}(p \,\|\, p^*) \le \epsilon$ in $\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.
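The discrete Langevin diffusion the abstract analyzes can be sketched with the standard unadjusted Langevin algorithm (ULA) update, $x_{k+1} = x_k + \eta \nabla \log p^*(x_k) + \sqrt{2\eta}\,\xi_k$ with $\xi_k \sim \mathcal{N}(0, I)$. The sketch below is a minimal illustration, not the paper's analysis; the standard-Gaussian target, step size, and chain length are assumptions chosen for demonstration (for this target, $-\log p^*$ is $1$-smooth and $1$-strongly convex).

```python
import numpy as np

def ula_chain(grad_log_p, x0, step, n_steps, rng):
    """Run the unadjusted Langevin algorithm:
    x_{k+1} = x_k + step * grad_log_p(x_k) + sqrt(2*step) * N(0, I)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

# Illustrative target: standard Gaussian, log p*(x) = -x^2/2 + const,
# so grad log p*(x) = -x. We run 2000 independent one-dimensional chains
# (one per array entry) from a far-off start and inspect the samples.
rng = np.random.default_rng(0)
samples = ula_chain(lambda x: -x, x0=np.full(2000, 5.0),
                    step=0.05, n_steps=400, rng=rng)
print(samples.mean(), samples.var())  # both should be near 0 and 1
```

Note that with a fixed step size the chain's stationary distribution is slightly biased away from $p^*$; the paper's $\tilde{O}(d/\epsilon)$ bound quantifies how small the step must be (and how many steps are needed) to bring the KL-divergence below $\epsilon$.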

Cite this Paper


BibTeX
@InProceedings{pmlr-v83-cheng18a,
  title     = {Convergence of Langevin MCMC in KL-divergence},
  author    = {Cheng, Xiang and Bartlett, Peter},
  booktitle = {Proceedings of Algorithmic Learning Theory},
  pages     = {186--211},
  year      = {2018},
  editor    = {Janoos, Firdaus and Mohri, Mehryar and Sridharan, Karthik},
  volume    = {83},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v83/cheng18a/cheng18a.pdf},
  url       = {https://proceedings.mlr.press/v83/cheng18a.html},
  abstract  = {Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$-smooth and $m$-strongly convex, discrete Langevin diffusion produces a distribution $p$ with $\mathrm{KL}(p \,\|\, p^*) \le \epsilon$ in $\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.}
}
Endnote
%0 Conference Paper
%T Convergence of Langevin MCMC in KL-divergence
%A Xiang Cheng
%A Peter Bartlett
%B Proceedings of Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Firdaus Janoos
%E Mehryar Mohri
%E Karthik Sridharan
%F pmlr-v83-cheng18a
%I PMLR
%P 186--211
%U https://proceedings.mlr.press/v83/cheng18a.html
%V 83
%X Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density $p^*$ is such that $\log p^*$ is $L$-smooth and $m$-strongly convex, discrete Langevin diffusion produces a distribution $p$ with $\mathrm{KL}(p \,\|\, p^*) \le \epsilon$ in $\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics.
APA
Cheng, X. & Bartlett, P. (2018). Convergence of Langevin MCMC in KL-divergence. Proceedings of Algorithmic Learning Theory, in Proceedings of Machine Learning Research 83:186-211. Available from https://proceedings.mlr.press/v83/cheng18a.html.