Position: A Theory of Deep Learning Must Include Compositional Sparsity

David A. Danhofer, Davide D’Ascenzo, Rafael Dubach, Tomaso A Poggio
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:81199-81210, 2025.

Abstract

Overparametrized Deep Neural Networks (DNNs) have demonstrated remarkable success in a wide variety of domains that are too high-dimensional for classical shallow networks, which are subject to the curse of dimensionality. However, open questions about the fundamental principles that govern the learning dynamics of DNNs remain. In this position paper we argue that it is the ability of DNNs to exploit the compositionally sparse structure of the target function that drives their success. DNNs can leverage the property that most practically relevant functions can be composed from a small set of constituent functions, each of which relies only on a low-dimensional subset of all inputs. We show that this property is shared by all efficiently Turing-computable functions and is therefore highly likely to be present in all current learning problems. While some promising theoretical insights exist on questions of approximation and generalization in the setting of compositionally sparse functions, several important questions on the learnability and optimization of DNNs remain open. Completing the picture of the role of compositional sparsity in deep learning is essential to a comprehensive theory of artificial—and even general—intelligence.
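To make the notion of compositional sparsity concrete, the following is a minimal illustrative sketch (not taken from the paper): a function of eight inputs built as a binary tree of 2-ary constituent functions, so that no constituent ever depends on more than a 2-dimensional subset of the inputs. The function h and the tree layout are assumptions chosen purely for illustration; a deep network can mirror such a tree layer by layer, whereas a shallow network must approximate the full 8-dimensional map at once.

# Illustrative sketch (assumption, not the paper's construction): a compositionally
# sparse function of d = 8 inputs composed from 7 constituent functions, each of
# which sees only 2 of its layer's inputs.
import numpy as np

def h(a, b):
    # an arbitrary smooth 2-ary constituent function; any low-arity map would do
    return np.tanh(a * b + a - b)

def f_sparse(x):
    # x: array of 8 inputs; the composition forms a binary tree of depth 3
    l1 = [h(x[0], x[1]), h(x[2], x[3]), h(x[4], x[5]), h(x[6], x[7])]
    l2 = [h(l1[0], l1[1]), h(l1[2], l1[3])]
    return h(l2[0], l2[1])

print(f_sparse(np.random.randn(8)))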

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-danhofer25a,
  title     = {Position: A Theory of Deep Learning Must Include Compositional Sparsity},
  author    = {Danhofer, David A. and D'Ascenzo, Davide and Dubach, Rafael and Poggio, Tomaso A},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {81199--81210},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/danhofer25a/danhofer25a.pdf},
  url       = {https://proceedings.mlr.press/v267/danhofer25a.html}
}
APA
Danhofer, D.A., D’Ascenzo, D., Dubach, R. & Poggio, T.A. (2025). Position: A Theory of Deep Learning Must Include Compositional Sparsity. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:81199-81210. Available from https://proceedings.mlr.press/v267/danhofer25a.html.