Wide stochastic networks: Gaussian limit and PAC-Bayesian training

Eugenio Clerico; George Deligiannidis; Arnaud Doucet

Proceedings of The 34th International Conference on Algorithmic Learning Theory, PMLR 201:447-470, 2023.

Abstract

The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.

Cite this Paper

BibTeX

@InProceedings{pmlr-v201-clerico23a,
  title = 	 {{Wide stochastic networks: Gaussian limit and PAC-Bayesian training}},
  author =       {Clerico, Eugenio and Deligiannidis, George and Doucet, Arnaud},
  booktitle = 	 {Proceedings of The 34th International Conference on Algorithmic Learning Theory},
  pages = 	 {447--470},
  year = 	 {2023},
  editor = 	 {Agrawal, Shipra and Orabona, Francesco},
  volume = 	 {201},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {20 Feb--23 Feb},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v201/clerico23a/clerico23a.pdf},
  url = 	 {https://proceedings.mlr.press/v201/clerico23a.html},
  abstract = 	 {The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.}
}

Endnote

%0 Conference Paper
%T Wide stochastic networks: Gaussian limit and PAC-Bayesian training
%A Eugenio Clerico
%A George Deligiannidis
%A Arnaud Doucet
%B Proceedings of The 34th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Shipra Agrawal
%E Francesco Orabona
%F pmlr-v201-clerico23a
%I PMLR
%P 447--470
%U https://proceedings.mlr.press/v201/clerico23a.html
%V 201
%X The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.

APA

Clerico, E., Deligiannidis, G. & Doucet, A. (2023). Wide stochastic networks: Gaussian limit and PAC-Bayesian training. Proceedings of The 34th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 201:447-470. Available from https://proceedings.mlr.press/v201/clerico23a.html.
