Nonconvex sampling with the Metropolis-adjusted Langevin algorithm

Oren Mangoubi; Nisheeth K Vishnoi

Nonconvex sampling with the Metropolis-adjusted Langevin algorithm

Oren Mangoubi, Nisheeth K Vishnoi

Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2259-2293, 2019.

Abstract

The Langevin Markov chain algorithms are widely deployed methods to sample from distributions in challenging high-dimensional and non-convex statistics and machine learning applications. Despite this, current bounds for the Langevin algorithms are worse than those of competing algorithms in many important situations, for instance when sampling from weakly log-concave distributions, or when sampling or optimizing non-convex log-densities. We obtain improved bounds in many of these situations, showing that the Metropolis-adjusted Langevin algorithm (MALA) is faster than the best bounds for its competitor algorithms when the target distribution satisfies weak third- and fourth- order regularity properties associated with the input data. In many settings, our regularity conditions are weaker than the usual Euclidean operator norm regularity properties, allowing us to show faster bounds for a much larger class of distributions than would be possible with the usual Euclidean operator norm approach, including in statistics and machine learning applications where the data satisfy a certain incoherence condition. In particular, we show that using our regularity conditions one can obtain faster bounds for applications which include sampling problems in Bayesian logistic regression with weakly convex priors, and the nonconvex optimization problem of learning linear classifiers with zero-one loss functions. Our main technical contribution is an analysis of the Metropolis acceptance probability of MALA in terms of its “energy-conservation error," and a bound for this error in terms of third- and fourth- order regularity conditions. The combination of this higher-order analysis of the energy conservation error with the conductance method is key to obtaining bounds which have a sub-linear dependence on the dimension $d$ in the non-strongly logconcave setting.

Cite this Paper

BibTeX


@InProceedings{pmlr-v99-mangoubi19a,
  title = 	 {Nonconvex sampling with the Metropolis-adjusted Langevin algorithm},
  author =       {Mangoubi, Oren and Vishnoi, Nisheeth K},
  booktitle = 	 {Proceedings of the Thirty-Second Conference on Learning Theory},
  pages = 	 {2259--2293},
  year = 	 {2019},
  editor = 	 {Beygelzimer, Alina and Hsu, Daniel},
  volume = 	 {99},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--28 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v99/mangoubi19a/mangoubi19a.pdf},
  url = 	 {https://proceedings.mlr.press/v99/mangoubi19a.html},
  abstract = 	 {The Langevin Markov chain algorithms are widely deployed methods to sample from distributions in challenging high-dimensional and non-convex statistics and machine learning applications. Despite this, current bounds for the Langevin algorithms are worse than those of competing algorithms in many important situations, for instance when sampling from weakly log-concave distributions, or when sampling or optimizing non-convex log-densities. We obtain improved bounds in many of these situations, showing that the Metropolis-adjusted Langevin algorithm (MALA) is faster than the best bounds for its competitor algorithms when the target distribution satisfies weak third- and fourth- order regularity properties associated with the input data. In many settings, our regularity conditions are weaker than the usual Euclidean operator norm regularity properties, allowing us to show faster bounds for a much larger class of distributions than would be possible with the usual Euclidean operator norm approach,  including in statistics and machine learning applications where the data satisfy a certain incoherence condition. In particular, we show that using our regularity conditions one can obtain faster bounds for applications which include sampling problems in Bayesian logistic regression with weakly convex priors, and the nonconvex optimization problem of learning linear classifiers with zero-one loss functions. Our main technical contribution is an analysis of the Metropolis acceptance probability of MALA in terms of its “energy-conservation error," and a bound for this error in terms of third- and fourth- order regularity conditions. The combination of this higher-order analysis of the energy conservation error with the conductance method is key to obtaining bounds which have a sub-linear dependence on the dimension $d$ in the non-strongly logconcave setting.}
}

Endnote

%0 Conference Paper
%T Nonconvex sampling with the Metropolis-adjusted Langevin algorithm
%A Oren Mangoubi
%A Nisheeth K Vishnoi
%B Proceedings of the Thirty-Second Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Alina Beygelzimer
%E Daniel Hsu	
%F pmlr-v99-mangoubi19a
%I PMLR
%P 2259--2293
%U https://proceedings.mlr.press/v99/mangoubi19a.html
%V 99
%X The Langevin Markov chain algorithms are widely deployed methods to sample from distributions in challenging high-dimensional and non-convex statistics and machine learning applications. Despite this, current bounds for the Langevin algorithms are worse than those of competing algorithms in many important situations, for instance when sampling from weakly log-concave distributions, or when sampling or optimizing non-convex log-densities. We obtain improved bounds in many of these situations, showing that the Metropolis-adjusted Langevin algorithm (MALA) is faster than the best bounds for its competitor algorithms when the target distribution satisfies weak third- and fourth- order regularity properties associated with the input data. In many settings, our regularity conditions are weaker than the usual Euclidean operator norm regularity properties, allowing us to show faster bounds for a much larger class of distributions than would be possible with the usual Euclidean operator norm approach,  including in statistics and machine learning applications where the data satisfy a certain incoherence condition. In particular, we show that using our regularity conditions one can obtain faster bounds for applications which include sampling problems in Bayesian logistic regression with weakly convex priors, and the nonconvex optimization problem of learning linear classifiers with zero-one loss functions. Our main technical contribution is an analysis of the Metropolis acceptance probability of MALA in terms of its “energy-conservation error," and a bound for this error in terms of third- and fourth- order regularity conditions. The combination of this higher-order analysis of the energy conservation error with the conductance method is key to obtaining bounds which have a sub-linear dependence on the dimension $d$ in the non-strongly logconcave setting.

APA


Mangoubi, O. & Vishnoi, N.K.. (2019). Nonconvex sampling with the Metropolis-adjusted Langevin algorithm. Proceedings of the Thirty-Second Conference on Learning Theory, in Proceedings of Machine Learning Research 99:2259-2293 Available from https://proceedings.mlr.press/v99/mangoubi19a.html.

Related Material

Download PDF