Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm

Sinho Chewi; Chen Lu; Kwangjun Ahn; Xiang Cheng; Thibaut Le Gouic; Philippe Rigollet

Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm

Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet

Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:1260-1300, 2021.

Abstract

Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as O(d^{1/3}), where d is the dimension. However, the diffusion scaling limit requires stringent assumptions on the target distribution and is asymptotic in nature. In contrast, the best known non-asymptotic mixing time bound for MALA on the class of log-smooth and strongly log-concave distributions is O(d). In this work, we establish that the mixing time of MALA on this class of target distributions is \tilde\Theta(d^{1/2}) under a warm start. Our upper bound proof introduces a new technique based on a projection characterization of the Metropolis adjustment which reduces the study of MALA to the well-studied discretization analysis of the Langevin SDE and bypasses direct computation of the acceptance probability.

Cite this Paper

BibTeX


@InProceedings{pmlr-v134-chewi21a,
  title = 	 {Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm},
  author =       {Chewi, Sinho and Lu, Chen and Ahn, Kwangjun and Cheng, Xiang and Gouic, Thibaut Le and Rigollet, Philippe},
  booktitle = 	 {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages = 	 {1260--1300},
  year = 	 {2021},
  editor = 	 {Belkin, Mikhail and Kpotufe, Samory},
  volume = 	 {134},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {15--19 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v134/chewi21a/chewi21a.pdf},
  url = 	 {https://proceedings.mlr.press/v134/chewi21a.html},
  abstract = 	 {Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as O(d^{1/3}), where d is the dimension. However, the diffusion scaling limit requires stringent assumptions on the target distribution and is asymptotic in nature. In contrast, the best known non-asymptotic mixing time bound for MALA on the class of log-smooth and strongly log-concave distributions is O(d). In this work, we establish that the mixing time of MALA on this class of target distributions is \tilde\Theta(d^{1/2}) under a warm start.  Our upper bound proof introduces a new technique based on a projection characterization of the Metropolis adjustment which reduces the study of MALA to the well-studied discretization analysis of the Langevin SDE and bypasses direct computation of the acceptance probability.}
}

Endnote

%0 Conference Paper
%T Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm
%A Sinho Chewi
%A Chen Lu
%A Kwangjun Ahn
%A Xiang Cheng
%A Thibaut Le Gouic
%A Philippe Rigollet
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe	
%F pmlr-v134-chewi21a
%I PMLR
%P 1260--1300
%U https://proceedings.mlr.press/v134/chewi21a.html
%V 134
%X Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as O(d^{1/3}), where d is the dimension. However, the diffusion scaling limit requires stringent assumptions on the target distribution and is asymptotic in nature. In contrast, the best known non-asymptotic mixing time bound for MALA on the class of log-smooth and strongly log-concave distributions is O(d). In this work, we establish that the mixing time of MALA on this class of target distributions is \tilde\Theta(d^{1/2}) under a warm start.  Our upper bound proof introduces a new technique based on a projection characterization of the Metropolis adjustment which reduces the study of MALA to the well-studied discretization analysis of the Langevin SDE and bypasses direct computation of the acceptance probability.

APA


Chewi, S., Lu, C., Ahn, K., Cheng, X., Gouic, T.L. & Rigollet, P.. (2021). Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:1260-1300 Available from https://proceedings.mlr.press/v134/chewi21a.html.

Related Material

Download PDF