Graph reparameterizations for enabling 1000+ Monte Carlo iterations in Bayesian deep neural networks

Jurijs Nazarovs, Ronak R. Mehta, Vishnu Suresh Lokhande, Vikas Singh
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:118-128, 2021.

Abstract

Uncertainty estimation in deep models is essential in many real-world applications and has benefited from developments over the last several years. Recent evidence suggests that existing solutions that depend on simple Gaussian formulations may not be sufficient. However, moving to other distributions necessitates Monte Carlo (MC) sampling to estimate quantities such as the KL divergence: this can be expensive and scales poorly as the dimensions of both the input data and the model grow. The cost is directly related to the structure of the computation graph, which can grow linearly as a function of the number of MC samples needed. Here, we construct a framework to describe these computation graphs, and identify probability families where the graph size can be independent of, or only weakly dependent on, the number of MC samples. These families correspond directly to large classes of distributions. Empirically, we can run a much larger number of MC iterations for larger architectures used in computer vision, with gains in performance measured in confident accuracy, stability of training, memory, and training time.
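
To make the graph-size issue concrete: under a naive construction, each MC sample of the weights adds its own nodes to the computation graph, so memory grows linearly in the number of samples. For location-scale families such as the Gaussian, the reparameterization w = mu + sigma * eps lets all samples share a single pair of parameter nodes, so the graph size stays independent of the sample count. The following PyTorch sketch illustrates this idea under those assumptions; it is a generic location-scale example with illustrative variable names, not the paper's implementation.

import torch

# Illustrative sketch (not the authors' code): for a location-scale
# family, a weight sample can be written as w = mu + sigma * eps with
# eps ~ N(0, I). Drawing n_mc samples only changes the *shape* of eps;
# the graph still holds one (mu, sigma) node pair, independent of n_mc.

n_mc, dim = 1000, 128                             # 1000+ MC iterations, layer width
mu = torch.zeros(dim, requires_grad=True)         # variational mean
log_sigma = torch.zeros(dim, requires_grad=True)  # variational log-std

eps = torch.randn(n_mc, dim)          # all n_mc samples in one tensor
w = mu + log_sigma.exp() * eps        # broadcast: graph size is O(1) in n_mc

# MC estimate of KL(q || p) against a standard normal prior p = N(0, I):
# KL ~= (1/n_mc) * sum_i [log q(w_i) - log p(w_i)]
q = torch.distributions.Normal(mu, log_sigma.exp())
p = torch.distributions.Normal(0.0, 1.0)
kl_mc = (q.log_prob(w) - p.log_prob(w)).sum(dim=-1).mean()
kl_mc.backward()                      # one backward pass covers all samples

By contrast, building w inside a Python loop and accumulating the per-sample KL terms would keep n_mc copies of the sampling subgraph alive until backward, which is what makes 1000+ iterations infeasible for large architectures.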

Cite this Paper

BibTeX
@InProceedings{pmlr-v161-nazarovs21b,
  title     = {Graph reparameterizations for enabling 1000+ Monte Carlo iterations in Bayesian deep neural networks},
  author    = {Nazarovs, Jurijs and Mehta, Ronak R. and Lokhande, Vishnu Suresh and Singh, Vikas},
  booktitle = {Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence},
  pages     = {118--128},
  year      = {2021},
  editor    = {de Campos, Cassio and Maathuis, Marloes H.},
  volume    = {161},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v161/nazarovs21b/nazarovs21b.pdf},
  url       = {https://proceedings.mlr.press/v161/nazarovs21b.html}
}
Endnote
%0 Conference Paper
%T Graph reparameterizations for enabling 1000+ Monte Carlo iterations in Bayesian deep neural networks
%A Jurijs Nazarovs
%A Ronak R. Mehta
%A Vishnu Suresh Lokhande
%A Vikas Singh
%B Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2021
%E Cassio de Campos
%E Marloes H. Maathuis
%F pmlr-v161-nazarovs21b
%I PMLR
%P 118--128
%U https://proceedings.mlr.press/v161/nazarovs21b.html
%V 161
APA
Nazarovs, J., Mehta, R.R., Lokhande, V.S. & Singh, V. (2021). Graph reparameterizations for enabling 1000+ Monte Carlo iterations in Bayesian deep neural networks. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 161:118-128. Available from https://proceedings.mlr.press/v161/nazarovs21b.html.

Related Material

Download PDF: https://proceedings.mlr.press/v161/nazarovs21b/nazarovs21b.pdf