On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization

Amir Joudaki, Hadi Daneshmand, Francis Bach
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:15388-15400, 2023.

Abstract

Mean-field theory is widely used in theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions for Gram matrices of hidden representations in deep multilayer perceptrons (MLPs) with batch normalization (BN) at initialization. It is postulated that the mean-field predictions suffer from layer-wise errors that amplify with depth. We demonstrate that BN avoids this error amplification with depth. When the chain of hidden representations is rapidly mixing, we establish a concentration bound for a mean-field model of Gram matrices. To our knowledge, this is the first concentration bound that does not become vacuous with depth for standard MLPs with a finite width.
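To make the object of study concrete, the following is a minimal NumPy sketch of the Gram matrices of hidden representations in a random MLP with BN at initialization. It is an illustration under stated assumptions, not the authors' experimental setup: the `tanh` activation, the placement of BN after the activation, the Gaussian weights scaled by the inverse square root of the fan-in, and the names `batch_norm` and `gram_matrices` are all hypothetical choices made for this example.

```python
import numpy as np


def batch_norm(h, eps=1e-5):
    # Normalize each feature (row) across the batch (columns)
    # to zero mean and approximately unit variance.
    mean = h.mean(axis=1, keepdims=True)
    var = h.var(axis=1, keepdims=True)
    return (h - mean) / np.sqrt(var + eps)


def gram_matrices(x, width, depth, rng, act=np.tanh):
    """Return one (batch x batch) Gram matrix per layer of a random MLP with BN.

    Assumptions for illustration only: i.i.d. Gaussian weights with
    variance 1 / fan_in, and BN applied after the activation.
    """
    h = x  # shape: (features, batch)
    grams = []
    for _ in range(depth):
        fan_in = h.shape[0]
        w = rng.standard_normal((width, fan_in)) / np.sqrt(fan_in)
        h = batch_norm(act(w @ h))
        # Gram matrix of the batch, averaged over the hidden units.
        grams.append(h.T @ h / width)
    return grams


rng = np.random.default_rng(0)
x = rng.standard_normal((16, 4))  # 16 input features, batch of 4 samples
grams = gram_matrices(x, width=512, depth=20, rng=rng)
```

Rerunning `gram_matrices` with different seeds and widths, and comparing the spread of the last-layer Gram matrix, gives a rough empirical view of how finite-width fluctuations behave as depth grows. Such an experiment only illustrates the question; it does not verify the paper's bound.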

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-joudaki23a,
  title     = {On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization},
  author    = {Joudaki, Amir and Daneshmand, Hadi and Bach, Francis},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {15388--15400},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/joudaki23a/joudaki23a.pdf},
  url       = {https://proceedings.mlr.press/v202/joudaki23a.html},
  abstract  = {Mean-field theory is widely used in theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions for Gram matrices of hidden representations in deep multilayer perceptron (MLP) with batch normalization (BN) at initialization. It is postulated that the mean-field predictions suffer from layer-wise errors that amplify with depth. We demonstrate that BN avoids this error amplification with depth. When the chain of hidden representations is rapidly mixing, we establish a concentration bound for a mean-field model of Gram matrices. To our knowledge, this is the first concentration bound that does not become vacuous with depth for standard MLPs with a finite width.}
}
Endnote
%0 Conference Paper
%T On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization
%A Amir Joudaki
%A Hadi Daneshmand
%A Francis Bach
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-joudaki23a
%I PMLR
%P 15388--15400
%U https://proceedings.mlr.press/v202/joudaki23a.html
%V 202
%X Mean-field theory is widely used in theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions for Gram matrices of hidden representations in deep multilayer perceptron (MLP) with batch normalization (BN) at initialization. It is postulated that the mean-field predictions suffer from layer-wise errors that amplify with depth. We demonstrate that BN avoids this error amplification with depth. When the chain of hidden representations is rapidly mixing, we establish a concentration bound for a mean-field model of Gram matrices. To our knowledge, this is the first concentration bound that does not become vacuous with depth for standard MLPs with a finite width.
APA
Joudaki, A., Daneshmand, H., & Bach, F. (2023). On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:15388-15400. Available from https://proceedings.mlr.press/v202/joudaki23a.html.
