Stochastic Weight Sharing for Bayesian Neural Networks

Moule Lin, Shuhao Guan, Weipeng Jing, Goetz Botterweck, Andrea Patane
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:4519-4527, 2025.

Abstract

While offering a principled framework for uncertainty quantification in deep learning, the adoption of Bayesian Neural Networks (BNNs) is still constrained by their increased computational requirements and by convergence difficulties when training very deep, state-of-the-art architectures. In this work, we reinterpret weight-sharing quantization techniques from a stochastic perspective in the context of training and inference with BNNs. Specifically, we leverage 2D-adaptive Gaussian distributions, Wasserstein distance estimations, and alpha-blending to encode the stochastic behavior of a BNN in a lower-dimensional, soft Gaussian representation. Through extensive empirical investigation, we demonstrate that our approach reduces the computational overhead inherent in Bayesian learning by several orders of magnitude, enabling efficient Bayesian training of large-scale models such as ResNet-101 and the Vision Transformer (ViT). On various computer vision benchmarks, including CIFAR-10, CIFAR-100, and ImageNet1k, our approach compresses model parameters by approximately 50$\times$ and reduces model size by 75% while achieving accuracy and uncertainty estimates comparable to the state of the art.
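The abstract names its main ingredients without detailing the method. As a rough illustration only, the sketch below (Python with NumPy/SciPy, not the authors' implementation) shows two standard building blocks it refers to: the closed-form 2-Wasserstein distance between two Gaussians, which can score how close a weight's posterior is to a shared 2D Gaussian, and a moment-wise alpha-blend of two Gaussian components. The function names, the blending rule, and the example numbers are assumptions made for illustration.

import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2(mu1, cov1, mu2, cov2):
    # Closed-form 2-Wasserstein distance between N(mu1, cov1) and N(mu2, cov2):
    # W2^2 = ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov2^{1/2} cov1 cov2^{1/2})^{1/2})
    mean_term = np.sum((np.asarray(mu1) - np.asarray(mu2)) ** 2)
    root2 = sqrtm(cov2)
    cross = sqrtm(root2 @ cov1 @ root2)
    cov_term = np.trace(cov1 + cov2 - 2.0 * np.real(cross))
    return float(np.sqrt(max(mean_term + cov_term, 0.0)))

def alpha_blend(mu1, cov1, mu2, cov2, alpha):
    # Hypothetical moment-wise blend of two Gaussian components; the paper's
    # alpha-blending may differ, this is only one plausible reading.
    mu = alpha * mu1 + (1.0 - alpha) * mu2
    cov = alpha * cov1 + (1.0 - alpha) * cov2
    return mu, cov

# Example: distance between, and blend of, two candidate shared 2D Gaussians.
mu_a, cov_a = np.array([0.10, 0.04]), np.diag([0.010, 0.001])
mu_b, cov_b = np.array([0.15, 0.06]), np.diag([0.020, 0.002])
print(gaussian_w2(mu_a, cov_a, mu_b, cov_b))
print(alpha_blend(mu_a, cov_a, mu_b, cov_b, alpha=0.7))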

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-lin25a, title = {Stochastic Weight Sharing for Bayesian Neural Networks}, author = {Lin, Moule and Guan, Shuhao and Jing, Weipeng and Botterweck, Goetz and Patane, Andrea}, booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics}, pages = {4519--4527}, year = {2025}, editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz}, volume = {258}, series = {Proceedings of Machine Learning Research}, month = {03--05 May}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/lin25a/lin25a.pdf}, url = {https://proceedings.mlr.press/v258/lin25a.html}, abstract = {While offering a principled framework for uncertainty quantification in deep learning, the adoption of Bayesian Neural Networks (BNNs) is still constrained by their increased computational requirements and by convergence difficulties when training very deep, state-of-the-art architectures. In this work, we reinterpret weight-sharing quantization techniques from a stochastic perspective in the context of training and inference with BNNs. Specifically, we leverage 2D-adaptive Gaussian distributions, Wasserstein distance estimations, and alpha-blending to encode the stochastic behavior of a BNN in a lower-dimensional, soft Gaussian representation. Through extensive empirical investigation, we demonstrate that our approach reduces the computational overhead inherent in Bayesian learning by several orders of magnitude, enabling efficient Bayesian training of large-scale models such as ResNet-101 and the Vision Transformer (ViT). On various computer vision benchmarks, including CIFAR-10, CIFAR-100, and ImageNet1k, our approach compresses model parameters by approximately 50$\times$ and reduces model size by 75% while achieving accuracy and uncertainty estimates comparable to the state of the art.} }
Endnote
%0 Conference Paper %T Stochastic Weight Sharing for Bayesian Neural Networks %A Moule Lin %A Shuhao Guan %A Weipeng Jing %A Goetz Botterweck %A Andrea Patane %B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2025 %E Yingzhen Li %E Stephan Mandt %E Shipra Agrawal %E Emtiyaz Khan %F pmlr-v258-lin25a %I PMLR %P 4519--4527 %U https://proceedings.mlr.press/v258/lin25a.html %V 258 %X While offering a principled framework for uncertainty quantification in deep learning, the adoption of Bayesian Neural Networks (BNNs) is still constrained by their increased computational requirements and by convergence difficulties when training very deep, state-of-the-art architectures. In this work, we reinterpret weight-sharing quantization techniques from a stochastic perspective in the context of training and inference with BNNs. Specifically, we leverage 2D-adaptive Gaussian distributions, Wasserstein distance estimations, and alpha-blending to encode the stochastic behavior of a BNN in a lower-dimensional, soft Gaussian representation. Through extensive empirical investigation, we demonstrate that our approach reduces the computational overhead inherent in Bayesian learning by several orders of magnitude, enabling efficient Bayesian training of large-scale models such as ResNet-101 and the Vision Transformer (ViT). On various computer vision benchmarks, including CIFAR-10, CIFAR-100, and ImageNet1k, our approach compresses model parameters by approximately 50$\times$ and reduces model size by 75% while achieving accuracy and uncertainty estimates comparable to the state of the art.
APA
Lin, M., Guan, S., Jing, W., Botterweck, G., & Patane, A. (2025). Stochastic Weight Sharing for Bayesian Neural Networks. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:4519-4527. Available from https://proceedings.mlr.press/v258/lin25a.html.

Related Material