Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris Maddison
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3831-3841, 2021.

Abstract

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate our improved sampler for training deep energy-based models on high dimensional discrete image data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.
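To make the proposed update concrete: for binary variables, the gradient of the log-likelihood gives a first-order Taylor estimate of how f(x) would change under each single-bit flip, and those estimates define a softmax proposal over coordinates inside a Metropolis-Hastings step. Below is a minimal sketch of this idea in PyTorch. It is an illustration under stated assumptions, not the authors' released implementation; the toy Ising-style energy f and the names proposal_logits and gwg_step are hypothetical.

import torch

def proposal_logits(f, x):
    # Log-probabilities over which bit of x to flip. Uses a first-order
    # Taylor estimate of the change in f from flipping bit i:
    #   d(x)_i = -(2 x_i - 1) * df/dx_i.
    x = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(f(x), x)[0]
    d = -(2.0 * x - 1.0) * grad              # estimated log-prob change per flip
    return torch.log_softmax(d / 2.0, dim=-1).detach()

def gwg_step(f, x):
    # One Metropolis-Hastings step with the gradient-based flip proposal.
    logits = proposal_logits(f, x)
    i = torch.distributions.Categorical(logits=logits).sample()
    x_new = x.clone()
    x_new[i] = 1.0 - x_new[i]                # flip the chosen bit
    # MH acceptance ratio: exp(f(x') - f(x)) * q(i | x') / q(i | x).
    log_alpha = f(x_new) - f(x) + proposal_logits(f, x_new)[i] - logits[i]
    if torch.rand(()) < torch.exp(log_alpha):
        return x_new
    return x

# Toy Ising-style energy on D binary variables (W and b are made up here).
D = 16
W = torch.randn(D, D) * 0.1
W = (W + W.T) / 2
b = torch.randn(D) * 0.1

def f(x):
    s = 2.0 * x - 1.0                        # map {0,1} to {-1,+1} spins
    return s @ W @ s + s @ b

x = torch.randint(0, 2, (D,)).float()
for _ in range(1000):
    x = gwg_step(f, x)

One design point worth noting: a single backward pass scores all D possible single-bit flips at once, so the proposal concentrates on promising coordinates without the D separate energy evaluations a naive locally informed proposal would require.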

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-grathwohl21a,
  title     = {Oops I Took A Gradient: Scalable Sampling for Discrete Distributions},
  author    = {Grathwohl, Will and Swersky, Kevin and Hashemi, Milad and Duvenaud, David and Maddison, Chris},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {3831--3841},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/grathwohl21a/grathwohl21a.pdf},
  url       = {https://proceedings.mlr.press/v139/grathwohl21a.html},
  abstract  = {We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate our improved sampler for training deep energy-based models on high dimensional discrete image data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.}
}
Endnote
%0 Conference Paper
%T Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
%A Will Grathwohl
%A Kevin Swersky
%A Milad Hashemi
%A David Duvenaud
%A Chris Maddison
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-grathwohl21a
%I PMLR
%P 3831--3841
%U https://proceedings.mlr.press/v139/grathwohl21a.html
%V 139
%X We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate our improved sampler for training deep energy-based models on high dimensional discrete image data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.
APA
Grathwohl, W., Swersky, K., Hashemi, M., Duvenaud, D. & Maddison, C. (2021). Oops I Took A Gradient: Scalable Sampling for Discrete Distributions. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3831-3841. Available from https://proceedings.mlr.press/v139/grathwohl21a.html.