Training Binary Neural Networks using the Bayesian Learning Rule

Xiangming Meng; Roman Bachmann; Mohammad Emtiyaz Khan

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:6852-6861, 2020.

Abstract

Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation and continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which also justifies and extends existing approaches.

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-meng20a,
  title = 	 {Training Binary Neural Networks using the {B}ayesian Learning Rule},
  author =       {Meng, Xiangming and Bachmann, Roman and Khan, Mohammad Emtiyaz},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {6852--6861},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/meng20a/meng20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/meng20a.html},
  abstract = 	 {Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation and continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which also justifies and extends existing approaches.}
}

Endnote

%0 Conference Paper
%T Training Binary Neural Networks using the Bayesian Learning Rule
%A Xiangming Meng
%A Roman Bachmann
%A Mohammad Emtiyaz Khan
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-meng20a
%I PMLR
%P 6852--6861
%U https://proceedings.mlr.press/v119/meng20a.html
%V 119
%X Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation and continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which also justifies and extends existing approaches.

APA


Meng, X., Bachmann, R., & Khan, M. E. (2020). Training Binary Neural Networks using the Bayesian Learning Rule. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:6852-6861. Available from https://proceedings.mlr.press/v119/meng20a.html.
