Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks

Adeel Pervez, Taco Cohen, Efstratios Gavves
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:7632-7640, 2020.

Abstract

Stochastic neural networks with discrete random variables are an important class of models for their expressiveness and interpretability. Since direct differentiation and backpropagation are not possible, Monte Carlo gradient estimation techniques are a popular alternative. Efficient stochastic gradient estimators, such as Straight-Through and Gumbel-Softmax, work well for shallow stochastic models. Their performance, however, suffers with hierarchical, more complex models. We focus on stochastic networks with Boolean latent variables. To analyze such networks, we introduce the framework of harmonic analysis for Boolean functions to derive an analytic formulation for the bias and variance in the Straight-Through estimator. Exploiting these formulations, we propose \emph{FouST}, a low-bias and low-variance gradient estimation algorithm that is just as efficient. Extensive experiments show that FouST performs favorably compared to state-of-the-art biased estimators and is much faster than unbiased ones.
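The abstract contrasts biased estimators such as Straight-Through with unbiased ones, and the paper's analysis centers on the bias of Straight-Through for Boolean variables. The following is a minimal NumPy sketch (not the paper's code; the loss f and all names are illustrative) of the Straight-Through estimator for a single Bernoulli unit, compared against the exact gradient, which makes the bias visible:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def f(b):
    # Illustrative downstream loss on the Boolean sample b in {0, 1}.
    return (b - 0.3) ** 2

def true_grad(theta):
    # Exact gradient: E[f(b)] = p*f(1) + (1-p)*f(0) with p = sigmoid(theta),
    # so d/dtheta E[f(b)] = (f(1) - f(0)) * p * (1 - p).
    p = sigmoid(theta)
    return (f(1.0) - f(0.0)) * p * (1.0 - p)

def straight_through_grad(theta, n_samples=100_000):
    # Straight-Through: sample hard Booleans in the forward pass, but
    # backpropagate as if db/dtheta were dp/dtheta = p*(1-p).
    p = sigmoid(theta)
    b = (rng.random(n_samples) < p).astype(float)  # Boolean samples
    df_db = 2.0 * (b - 0.3)                        # dL/db at each sample
    return np.mean(df_db * p * (1.0 - p))

theta = 0.5
print(true_grad(theta), straight_through_grad(theta))
```

For this toy loss the Straight-Through estimate converges to a value above the exact gradient, illustrating the estimator's bias that the paper analyzes via harmonic analysis of Boolean functions.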

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-pervez20a,
  title     = {Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks},
  author    = {Pervez, Adeel and Cohen, Taco and Gavves, Efstratios},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {7632--7640},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/pervez20a/pervez20a.pdf},
  url       = {https://proceedings.mlr.press/v119/pervez20a.html},
  abstract  = {Stochastic neural networks with discrete random variables are an important class of models for their expressiveness and interpretability. Since direct differentiation and backpropagation are not possible, Monte Carlo gradient estimation techniques are a popular alternative. Efficient stochastic gradient estimators, such as Straight-Through and Gumbel-Softmax, work well for shallow stochastic models. Their performance, however, suffers with hierarchical, more complex models. We focus on stochastic networks with Boolean latent variables. To analyze such networks, we introduce the framework of harmonic analysis for Boolean functions to derive an analytic formulation for the bias and variance in the Straight-Through estimator. Exploiting these formulations, we propose \emph{FouST}, a low-bias and low-variance gradient estimation algorithm that is just as efficient. Extensive experiments show that FouST performs favorably compared to state-of-the-art biased estimators and is much faster than unbiased ones.}
}
Endnote
%0 Conference Paper
%T Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks
%A Adeel Pervez
%A Taco Cohen
%A Efstratios Gavves
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-pervez20a
%I PMLR
%P 7632--7640
%U https://proceedings.mlr.press/v119/pervez20a.html
%V 119
%X Stochastic neural networks with discrete random variables are an important class of models for their expressiveness and interpretability. Since direct differentiation and backpropagation are not possible, Monte Carlo gradient estimation techniques are a popular alternative. Efficient stochastic gradient estimators, such as Straight-Through and Gumbel-Softmax, work well for shallow stochastic models. Their performance, however, suffers with hierarchical, more complex models. We focus on stochastic networks with Boolean latent variables. To analyze such networks, we introduce the framework of harmonic analysis for Boolean functions to derive an analytic formulation for the bias and variance in the Straight-Through estimator. Exploiting these formulations, we propose \emph{FouST}, a low-bias and low-variance gradient estimation algorithm that is just as efficient. Extensive experiments show that FouST performs favorably compared to state-of-the-art biased estimators and is much faster than unbiased ones.
APA
Pervez, A., Cohen, T. & Gavves, E. (2020). Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:7632-7640. Available from https://proceedings.mlr.press/v119/pervez20a.html.

Related Material