Discrete Langevin Samplers via Wasserstein Gradient Flow

Haoran Sun, Hanjun Dai, Bo Dai, Haomin Zhou, Dale Schuurmans
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:6290-6313, 2023.

Abstract

It is known that gradient-based MCMC samplers for continuous spaces, such as Langevin Monte Carlo (LMC), can be derived as particle versions of a gradient flow that minimizes KL divergence on a Wasserstein manifold. The superior efficiency of such samplers has motivated several recent attempts to generalize LMC to discrete spaces. However, a fully principled extension of Langevin dynamics to discrete spaces has yet to be achieved, due to the lack of well-defined gradients in the sample space. In this work, we show how the Wasserstein gradient flow can be generalized naturally to discrete spaces. Given the proposed formulation, we demonstrate how a discrete analogue of Langevin dynamics can subsequently be developed. With this new understanding, we reveal how recent gradient-based samplers in discrete space can be obtained as special cases by choosing particular discretizations. More importantly, the framework also allows for the derivation of novel algorithms, one of which, discrete Langevin Monte Carlo (DLMC), is obtained by a factorized estimate of the transition matrix. The DLMC method admits a convenient parallel implementation and time-uniform sampling that achieves larger jump distances. We demonstrate the advantages of DLMC for sampling and learning in various binary and categorical distributions.
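To make the abstract's description concrete, below is a minimal NumPy sketch of what a factorized, gradient-informed proposal over a discrete space can look like. This is not the authors' reference implementation: the function name dlmc_style_step, the first-order Taylor surrogate for log-probability differences, and the first-jump approximation to each coordinate's matrix exponential are assumptions made for illustration, and the Metropolis-Hastings correction a practical sampler would include is omitted.

```python
import numpy as np

def dlmc_style_step(x, grad_log_p, t=1.0, rng=None):
    """One parallel, factorized proposal over all coordinates (illustrative sketch).

    x          : (d, k) one-hot array; d categorical variables, k categories each.
    grad_log_p : (d, k) gradient of log p at x, used as a first-order surrogate
                 for the change in log p from switching a coordinate's category.
    t          : simulation time of the surrogate continuous-time chain;
                 larger t permits larger jumps per step.
    """
    rng = np.random.default_rng() if rng is None else rng
    d, k = x.shape
    # First-order (Taylor) estimate of log p(x') - log p(x) when coordinate i
    # is moved to each category; zero at the current category.
    delta = grad_log_p - np.sum(grad_log_p * x, axis=1, keepdims=True)
    # Locally balanced rates toward each non-current category.
    rates = np.exp(0.5 * delta) * (1.0 - x)
    total = rates.sum(axis=1)  # total exit rate per coordinate
    # First-jump approximation to exp(t * R) for each coordinate's rate matrix R:
    # move within time t with probability 1 - exp(-t * total), and if a move
    # occurs, pick the destination category in proportion to its rate.
    moves = rng.random(d) < 1.0 - np.exp(-t * total)
    x_new = x.copy()
    for i in np.flatnonzero(moves):
        x_new[i] = np.eye(k)[rng.choice(k, p=rates[i] / total[i])]
    return x_new
```

Because every coordinate draws its move independently, the step is trivially parallelizable, which is the property the abstract highlights; the time parameter t controls how many coordinates move per step and hence the jump distance.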

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-sun23f,
  title = {Discrete Langevin Samplers via Wasserstein Gradient Flow},
  author = {Sun, Haoran and Dai, Hanjun and Dai, Bo and Zhou, Haomin and Schuurmans, Dale},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = {6290--6313},
  year = {2023},
  editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = {206},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v206/sun23f/sun23f.pdf},
  url = {https://proceedings.mlr.press/v206/sun23f.html},
  abstract = {It is known that gradient-based MCMC samplers for continuous spaces, such as Langevin Monte Carlo (LMC), can be derived as particle versions of a gradient flow that minimizes KL divergence on a Wasserstein manifold. The superior efficiency of such samplers has motivated several recent attempts to generalize LMC to discrete spaces. However, a fully principled extension of Langevin dynamics to discrete spaces has yet to be achieved, due to the lack of well-defined gradients in the sample space. In this work, we show how the Wasserstein gradient flow can be generalized naturally to discrete spaces. Given the proposed formulation, we demonstrate how a discrete analogue of Langevin dynamics can subsequently be developed. With this new understanding, we reveal how recent gradient-based samplers in discrete space can be obtained as special cases by choosing particular discretizations. More importantly, the framework also allows for the derivation of novel algorithms, one of which, discrete Langevin Monte Carlo (DLMC), is obtained by a factorized estimate of the transition matrix. The DLMC method admits a convenient parallel implementation and time-uniform sampling that achieves larger jump distances. We demonstrate the advantages of DLMC for sampling and learning in various binary and categorical distributions.}
}
Endnote
%0 Conference Paper
%T Discrete Langevin Samplers via Wasserstein Gradient Flow
%A Haoran Sun
%A Hanjun Dai
%A Bo Dai
%A Haomin Zhou
%A Dale Schuurmans
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-sun23f
%I PMLR
%P 6290--6313
%U https://proceedings.mlr.press/v206/sun23f.html
%V 206
%X It is known that gradient-based MCMC samplers for continuous spaces, such as Langevin Monte Carlo (LMC), can be derived as particle versions of a gradient flow that minimizes KL divergence on a Wasserstein manifold. The superior efficiency of such samplers has motivated several recent attempts to generalize LMC to discrete spaces. However, a fully principled extension of Langevin dynamics to discrete spaces has yet to be achieved, due to the lack of well-defined gradients in the sample space. In this work, we show how the Wasserstein gradient flow can be generalized naturally to discrete spaces. Given the proposed formulation, we demonstrate how a discrete analogue of Langevin dynamics can subsequently be developed. With this new understanding, we reveal how recent gradient-based samplers in discrete space can be obtained as special cases by choosing particular discretizations. More importantly, the framework also allows for the derivation of novel algorithms, one of which, discrete Langevin Monte Carlo (DLMC), is obtained by a factorized estimate of the transition matrix. The DLMC method admits a convenient parallel implementation and time-uniform sampling that achieves larger jump distances. We demonstrate the advantages of DLMC for sampling and learning in various binary and categorical distributions.
APA
Sun, H., Dai, H., Dai, B., Zhou, H. & Schuurmans, D. (2023). Discrete Langevin Samplers via Wasserstein Gradient Flow. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:6290-6313. Available from https://proceedings.mlr.press/v206/sun23f.html.