Rao-Blackwellized Stochastic Gradients for Discrete Distributions

Runjing Liu, Jeffrey Regier, Nilesh Tripuraneni, Michael Jordan, Jon McAuliffe
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:4023-4031, 2019.

Abstract

We wish to compute the gradient of an expectation over a finite or countably infinite sample space having $K \leq \infty$ categories. When $K$ is indeed infinite, or finite but very large, the relevant summation is intractable. Accordingly, various stochastic gradient estimators have been proposed. In this paper, we describe a technique that can be applied to reduce the variance of any such estimator, without changing its bias; in particular, unbiasedness is retained. We show that our technique is an instance of Rao-Blackwellization, and we demonstrate the improvement it yields on a semi-supervised classification problem and a pixel attention task.
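The abstract does not spell out the estimator, but the general recipe it points to can be sketched in a few lines. The NumPy toy below (all names, the choice of the analytically summed set C, and the test function f are illustrative assumptions, not the paper's exact algorithm or notation) compares a plain score-function (REINFORCE) estimator of the gradient of an expectation under a categorical distribution with a Rao-Blackwellized variant that sums the few highest-probability categories exactly and applies the score-function estimator only to the remaining probability mass, reweighted so unbiasedness is retained:

import numpy as np

# Toy comparison of a plain score-function (REINFORCE) gradient estimator
# with a Rao-Blackwellized variant, in the spirit of the abstract. All names
# here (theta, f, C, ...) are illustrative assumptions, not the paper's.
rng = np.random.default_rng(0)
K = 10                                  # number of categories
theta = rng.normal(size=K)              # logits of q = Categorical(softmax(theta))
f = rng.normal(size=K)                  # arbitrary function whose expectation we differentiate

def softmax(t):
    e = np.exp(t - t.max())
    return e / e.sum()

q = softmax(theta)

def grad_log_q(z):
    # gradient of log softmax(theta)_z with respect to theta: e_z - q
    g = -q.copy()
    g[z] += 1.0
    return g

# Exact gradient for reference: sum_k f(k) * dq_k/dtheta,
# with dq_k/dtheta_j = q_k * (delta_kj - q_j).
jac = q[:, None] * (np.eye(K) - q[None, :])     # row k is dq_k/dtheta
exact_grad = f @ jac

def reinforce():
    # plain score-function estimator: f(z) * grad log q(z), with z ~ q
    z = rng.choice(K, p=q)
    return f[z] * grad_log_q(z)

C = np.argsort(q)[-3:]                  # the 3 most probable categories (illustrative choice)
in_C = np.zeros(K, dtype=bool)
in_C[C] = True
tail_mass = 1.0 - q[C].sum()            # probability of the remaining categories
q_tail = np.where(in_C, 0.0, q) / tail_mass    # q conditioned on z not in C

def rao_blackwellized():
    # exact contribution of C, plus a reweighted score-function term for the
    # tail; taking expectations shows the estimator remains unbiased.
    analytic = f[C] @ jac[C]
    z = rng.choice(K, p=q_tail)
    return analytic + tail_mass * f[z] * grad_log_q(z)

for est in (reinforce, rao_blackwellized):
    draws = np.stack([est() for _ in range(20000)])
    print(f"{est.__name__:>17}: max |bias| = {np.abs(draws.mean(0) - exact_grad).max():.4f}, "
          f"total variance = {draws.var(0).sum():.4f}")

Both estimators match the exact gradient in expectation; the Rao-Blackwellized one typically reports a substantially smaller total variance, which is the behavior the abstract describes.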

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-liu19c,
  title     = {Rao-Blackwellized Stochastic Gradients for Discrete Distributions},
  author    = {Liu, Runjing and Regier, Jeffrey and Tripuraneni, Nilesh and Jordan, Michael and McAuliffe, Jon},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {4023--4031},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/liu19c/liu19c.pdf},
  url       = {https://proceedings.mlr.press/v97/liu19c.html},
  abstract  = {We wish to compute the gradient of an expectation over a finite or countably infinite sample space having $K \leq \infty$ categories. When $K$ is indeed infinite, or finite but very large, the relevant summation is intractable. Accordingly, various stochastic gradient estimators have been proposed. In this paper, we describe a technique that can be applied to reduce the variance of any such estimator, without changing its bias; in particular, unbiasedness is retained. We show that our technique is an instance of Rao-Blackwellization, and we demonstrate the improvement it yields on a semi-supervised classification problem and a pixel attention task.}
}
Endnote
%0 Conference Paper
%T Rao-Blackwellized Stochastic Gradients for Discrete Distributions
%A Runjing Liu
%A Jeffrey Regier
%A Nilesh Tripuraneni
%A Michael Jordan
%A Jon McAuliffe
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-liu19c
%I PMLR
%P 4023--4031
%U https://proceedings.mlr.press/v97/liu19c.html
%V 97
%X We wish to compute the gradient of an expectation over a finite or countably infinite sample space having $K \leq \infty$ categories. When $K$ is indeed infinite, or finite but very large, the relevant summation is intractable. Accordingly, various stochastic gradient estimators have been proposed. In this paper, we describe a technique that can be applied to reduce the variance of any such estimator, without changing its bias; in particular, unbiasedness is retained. We show that our technique is an instance of Rao-Blackwellization, and we demonstrate the improvement it yields on a semi-supervised classification problem and a pixel attention task.
APA
Liu, R., Regier, J., Tripuraneni, N., Jordan, M. & McAuliffe, J. (2019). Rao-Blackwellized Stochastic Gradients for Discrete Distributions. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:4023-4031. Available from https://proceedings.mlr.press/v97/liu19c.html.
