On Generalization Bounds for Neural Networks with Low Rank Layers

Andrea Pinto, Akshay Rangamani, Tomaso A Poggio
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:921-936, 2025.

Abstract

While previous optimization results have suggested that deep neural networks tend to favour low-rank weight matrices, the implications of this inductive bias on generalization bounds remain underexplored. In this paper, we apply a chain rule for Gaussian complexity (Maurer, 2016a) to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep networks exhibiting neural collapse.
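For readers unfamiliar with the function class the bounds apply to, the following minimal sketch (in NumPy, not taken from the paper) illustrates one common way to realize a rank- and spectral-norm-constrained layer: parameterize the weight matrix as a product of two thin factors so its rank is at most r, then rescale so its largest singular value does not exceed a given bound. The function name and parameters are illustrative assumptions, not the authors' construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def low_rank_layer(d_in, d_out, r, spectral_bound):
    """Sample a weight matrix with rank <= r and spectral norm <= spectral_bound.

    Illustrative only: W = U @ V with U in R^{d_out x r}, V in R^{r x d_in},
    so rank(W) <= r by construction; a final rescaling enforces the
    spectral-norm constraint.
    """
    U = rng.standard_normal((d_out, r))
    V = rng.standard_normal((r, d_in))
    W = U @ V
    top_singular = np.linalg.norm(W, 2)   # largest singular value of W
    if top_singular > spectral_bound:
        W *= spectral_bound / top_singular
    return W

W = low_rank_layer(d_in=512, d_out=512, r=8, spectral_bound=1.0)
print(np.linalg.matrix_rank(W), np.linalg.norm(W, 2))  # rank 8, spectral norm <= 1.0
```

Stacking such layers keeps each layer's rank and spectral norm individually controlled, which is the regime in which the paper's bounds avoid the multiplicative accumulation of rank and dimensionality factors across depth.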

Cite this Paper


BibTeX
@InProceedings{pmlr-v272-pinto25a,
  title     = {On Generalization Bounds for Neural Networks with Low Rank Layers},
  author    = {Pinto, Andrea and Rangamani, Akshay and Poggio, Tomaso A},
  booktitle = {Proceedings of The 36th International Conference on Algorithmic Learning Theory},
  pages     = {921--936},
  year      = {2025},
  editor    = {Kamath, Gautam and Loh, Po-Ling},
  volume    = {272},
  series    = {Proceedings of Machine Learning Research},
  month     = {24--27 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v272/main/assets/pinto25a/pinto25a.pdf},
  url       = {https://proceedings.mlr.press/v272/pinto25a.html},
  abstract  = {While previous optimization results have suggested that deep neural networks tend to favour low-rank weight matrices, the implications of this inductive bias on generalization bounds remain underexplored. In this paper, we apply a chain rule for Gaussian complexity (Maurer, 2016a) to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep networks exhibiting neural collapse.}
}
Endnote
%0 Conference Paper
%T On Generalization Bounds for Neural Networks with Low Rank Layers
%A Andrea Pinto
%A Akshay Rangamani
%A Tomaso A Poggio
%B Proceedings of The 36th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Gautam Kamath
%E Po-Ling Loh
%F pmlr-v272-pinto25a
%I PMLR
%P 921--936
%U https://proceedings.mlr.press/v272/pinto25a.html
%V 272
%X While previous optimization results have suggested that deep neural networks tend to favour low-rank weight matrices, the implications of this inductive bias on generalization bounds remain underexplored. In this paper, we apply a chain rule for Gaussian complexity (Maurer, 2016a) to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep networks exhibiting neural collapse.
APA
Pinto, A., Rangamani, A. &amp; Poggio, T.A. (2025). On Generalization Bounds for Neural Networks with Low Rank Layers. Proceedings of The 36th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 272:921-936. Available from https://proceedings.mlr.press/v272/pinto25a.html.