Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm

Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:1260-1300, 2021.

Abstract

Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as O(d^{1/3}), where d is the dimension. However, the diffusion scaling limit requires stringent assumptions on the target distribution and is asymptotic in nature. In contrast, the best known non-asymptotic mixing time bound for MALA on the class of log-smooth and strongly log-concave distributions is O(d). In this work, we establish that the mixing time of MALA on this class of target distributions is \tilde\Theta(d^{1/2}) under a warm start. Our upper bound proof introduces a new technique based on a projection characterization of the Metropolis adjustment which reduces the study of MALA to the well-studied discretization analysis of the Langevin SDE and bypasses direct computation of the acceptance probability.
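For readers unfamiliar with the algorithm, here is a minimal NumPy sketch (not taken from the paper; the function names mala_step, log_pi, and grad_log_pi are illustrative) of one MALA iteration: a Langevin proposal followed by a Metropolis-Hastings accept/reject step that makes the target distribution exactly stationary.

```python
import numpy as np

def mala_step(x, log_pi, grad_log_pi, h, rng):
    """One step of the Metropolis-Adjusted Langevin Algorithm (MALA).

    Proposal: one Euler step of the Langevin SDE,
        y = x + h * grad_log_pi(x) + sqrt(2h) * N(0, I),
    i.e. q(y | x) = N(x + h * grad_log_pi(x), 2h * I).
    The Metropolis filter accepts y with probability
        min(1, pi(y) q(x | y) / (pi(x) q(y | x))).
    """
    d = x.shape[0]
    y = x + h * grad_log_pi(x) + np.sqrt(2 * h) * rng.standard_normal(d)

    # log q(to | frm), up to a constant that cancels in the acceptance ratio
    def log_q(to, frm):
        diff = to - frm - h * grad_log_pi(frm)
        return -np.sum(diff ** 2) / (4 * h)

    log_alpha = log_pi(y) + log_q(x, y) - log_pi(x) - log_q(y, x)
    if np.log(rng.uniform()) < log_alpha:
        return y  # accept the proposal
    return x      # reject: the chain stays put
```

A toy usage example on a standard Gaussian target; the constant in the step size is arbitrary, but its d^{-1/2} scaling matches the \tilde\Theta(d^{1/2}) mixing regime described in the abstract:

```python
d = 100
rng = np.random.default_rng(0)
log_pi = lambda x: -0.5 * np.sum(x ** 2)   # log-density of N(0, I), up to a constant
grad_log_pi = lambda x: -x
x = rng.standard_normal(d)                 # crude stand-in for a warm start
h = 0.5 / np.sqrt(d)                       # step size of order d^{-1/2} (illustrative choice)
for _ in range(1000):
    x = mala_step(x, log_pi, grad_log_pi, h, rng)
```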

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-chewi21a,
  title     = {Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm},
  author    = {Chewi, Sinho and Lu, Chen and Ahn, Kwangjun and Cheng, Xiang and Le Gouic, Thibaut and Rigollet, Philippe},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {1260--1300},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/chewi21a/chewi21a.pdf},
  url       = {https://proceedings.mlr.press/v134/chewi21a.html}
}
Endnote
%0 Conference Paper
%T Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm
%A Sinho Chewi
%A Chen Lu
%A Kwangjun Ahn
%A Xiang Cheng
%A Thibaut Le Gouic
%A Philippe Rigollet
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-chewi21a
%I PMLR
%P 1260--1300
%U https://proceedings.mlr.press/v134/chewi21a.html
%V 134
APA
Chewi, S., Lu, C., Ahn, K., Cheng, X., Le Gouic, T. & Rigollet, P. (2021). Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:1260-1300. Available from https://proceedings.mlr.press/v134/chewi21a.html.