Lost Relatives of the Gumbel Trick

Matej Balog, Nilesh Tripuraneni, Zoubin Ghahramani, Adrian Weller
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:371-379, 2017.

Abstract

The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. We derive an entire family of related methods, of which the Gumbel trick is one member, and show that the new methods have superior properties in several settings with minimal additional computational cost. In particular, for the Gumbel trick to yield computational benefits for discrete graphical models, Gumbel perturbations on all configurations are typically replaced with so-called low-rank perturbations. We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution. Finally, we balance the discussion by showing how the simpler analytical form of the Gumbel trick enables additional theoretical results.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-balog17a, title = {Lost Relatives of the {G}umbel Trick}, author = {Matej Balog and Nilesh Tripuraneni and Zoubin Ghahramani and Adrian Weller}, booktitle = {Proceedings of the 34th International Conference on Machine Learning}, pages = {371--379}, year = {2017}, editor = {Precup, Doina and Teh, Yee Whye}, volume = {70}, series = {Proceedings of Machine Learning Research}, month = {06--11 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v70/balog17a/balog17a.pdf}, url = { http://proceedings.mlr.press/v70/balog17a.html }, abstract = {The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. We derive an entire family of related methods, of which the Gumbel trick is one member, and show that the new methods have superior properties in several settings with minimal additional computational cost. In particular, for the Gumbel trick to yield computational benefits for discrete graphical models, Gumbel perturbations on all configurations are typically replaced with so-called low-rank perturbations. We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution. Finally, we balance the discussion by showing how the simpler analytical form of the Gumbel trick enables additional theoretical results.} }
Endnote
%0 Conference Paper %T Lost Relatives of the Gumbel Trick %A Matej Balog %A Nilesh Tripuraneni %A Zoubin Ghahramani %A Adrian Weller %B Proceedings of the 34th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2017 %E Doina Precup %E Yee Whye Teh %F pmlr-v70-balog17a %I PMLR %P 371--379 %U http://proceedings.mlr.press/v70/balog17a.html %V 70 %X The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. We derive an entire family of related methods, of which the Gumbel trick is one member, and show that the new methods have superior properties in several settings with minimal additional computational cost. In particular, for the Gumbel trick to yield computational benefits for discrete graphical models, Gumbel perturbations on all configurations are typically replaced with so-called low-rank perturbations. We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution. Finally, we balance the discussion by showing how the simpler analytical form of the Gumbel trick enables additional theoretical results.
APA
Balog, M., Tripuraneni, N., Ghahramani, Z. & Weller, A.. (2017). Lost Relatives of the Gumbel Trick. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:371-379 Available from http://proceedings.mlr.press/v70/balog17a.html .

Related Material