Wide stochastic networks: Gaussian limit and PAC-Bayesian training

Eugenio Clerico, George Deligiannidis, Arnaud Doucet
Proceedings of The 34th International Conference on Algorithmic Learning Theory, PMLR 201:447-470, 2023.

Abstract

The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v201-clerico23a,
  title = {{Wide stochastic networks: Gaussian limit and PAC-Bayesian training}},
  author = {Clerico, Eugenio and Deligiannidis, George and Doucet, Arnaud},
  booktitle = {Proceedings of The 34th International Conference on Algorithmic Learning Theory},
  pages = {447--470},
  year = {2023},
  editor = {Agrawal, Shipra and Orabona, Francesco},
  volume = {201},
  series = {Proceedings of Machine Learning Research},
  month = {20 Feb--23 Feb},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v201/clerico23a/clerico23a.pdf},
  url = {https://proceedings.mlr.press/v201/clerico23a.html},
  abstract = {The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.}
}
Endnote
%0 Conference Paper
%T Wide stochastic networks: Gaussian limit and PAC-Bayesian training
%A Eugenio Clerico
%A George Deligiannidis
%A Arnaud Doucet
%B Proceedings of The 34th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Shipra Agrawal
%E Francesco Orabona
%F pmlr-v201-clerico23a
%I PMLR
%P 447--470
%U https://proceedings.mlr.press/v201/clerico23a.html
%V 201
%X The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.
APA
Clerico, E., Deligiannidis, G., & Doucet, A. (2023). Wide stochastic networks: Gaussian limit and PAC-Bayesian training. Proceedings of The 34th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 201:447-470. Available from https://proceedings.mlr.press/v201/clerico23a.html.

Related Material