Fundamental computational limits of weak learnability in high-dimensional multi-index models

Emanuele Troiani, Yatin Dandi, Leonardo Defilippis, Lenka Zdeborova, Bruno Loureiro, Florent Krzakala
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:2467-2475, 2025.

Abstract

Multi-index models, functions which depend on the covariates only through a non-linear transformation of their projection on a subspace, are a useful benchmark for investigating feature learning with neural networks. This paper examines the theoretical boundaries of efficient learnability in this hypothesis class, focusing particularly on the minimum sample complexity required for weakly recovering their low-dimensional structure with first-order iterative algorithms, in the high-dimensional regime where the number of samples $n=\alpha d$ is proportional to the covariate dimension $d$. Our findings unfold in three parts: (i) first, we identify under which conditions a \textit{trivial subspace} can be learned with a single step of a first-order algorithm for any $\alpha>0$; (ii) second, in the case where the trivial subspace is empty, we provide necessary and sufficient conditions for the existence of an \textit{easy subspace} consisting of directions that can be learned only above a certain sample complexity $\alpha>\alpha_c$. The critical threshold $\alpha_{c}$ marks the presence of a computational phase transition, in the sense that it is conjectured that no efficient iterative algorithm can succeed for $\alpha<\alpha_c$. For a limited but interesting set of hard directions, akin to the parity problem, $\alpha_c$ is found to diverge. Finally, (iii) we demonstrate that interactions between different directions can result in an intricate hierarchical learning phenomenon, where some directions can be learned sequentially when coupled to easier ones. Our analytical approach is built on the optimality of approximate message-passing algorithms among first-order iterative methods, delineating the fundamental learnability limit across a broad spectrum of algorithms, including neural networks trained with gradient descent.
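To make the setting concrete (this is only a toy illustration of the problem class, not of the paper's AMP analysis), the sketch below samples data from a multi-index model $y = g(Wx/\sqrt{d})$ with Gaussian covariates in the proportional regime $n=\alpha d$, takes a single first-order (plain gradient) step from an uninformed initialization, and measures the overlap of the estimate with the hidden directions. The specific link function, step size, and single-index probe are illustrative assumptions.

```python
import numpy as np

# Toy multi-index setting: labels depend on x only through the r projections W x / sqrt(d).
rng = np.random.default_rng(0)
d, r, alpha = 1000, 2, 4.0           # covariate dimension, index dimension, sample ratio
n = int(alpha * d)                    # n = alpha * d samples (proportional regime)

W = rng.standard_normal((r, d))       # hidden index directions (rows, nearly orthogonal for large d)
X = rng.standard_normal((n, d))       # Gaussian covariates
Z = X @ W.T / np.sqrt(d)              # low-dimensional projections with O(1) entries

# Illustrative link function: y = z_1 * z_2 couples the two hidden directions.
y = Z[:, 0] * Z[:, 1]

# One step of a first-order method from an uninformed start: a plain gradient step on
# the square loss of a single-index probe w -> mean (y - X w / sqrt(d))^2.
w0 = rng.standard_normal(d) / np.sqrt(d)
residual = y - X @ w0 / np.sqrt(d)
grad = -2 * X.T @ residual / (n * np.sqrt(d))
w1 = w0 - 1.0 * grad

# Weak-recovery diagnostic: normalized overlap of the estimate with each hidden direction.
overlaps = W @ w1 / (np.linalg.norm(W, axis=1) * np.linalg.norm(w1))
print("overlaps with hidden directions:", np.round(overlaps, 3))
```

For this particular link function the overlaps stay near zero, which is the kind of situation the paper describes as an empty trivial subspace: a single first-order step is not enough, and weak recovery requires either more samples or further iterations.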

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-troiani25a,
  title     = {Fundamental computational limits of weak learnability in high-dimensional multi-index models},
  author    = {Troiani, Emanuele and Dandi, Yatin and Defilippis, Leonardo and Zdeborova, Lenka and Loureiro, Bruno and Krzakala, Florent},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {2467--2475},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/troiani25a/troiani25a.pdf},
  url       = {https://proceedings.mlr.press/v258/troiani25a.html},
  abstract  = {Multi-index models, functions which depend on the covariates only through a non-linear transformation of their projection on a subspace, are a useful benchmark for investigating feature learning with neural networks. This paper examines the theoretical boundaries of efficient learnability in this hypothesis class, focusing particularly on the minimum sample complexity required for weakly recovering their low-dimensional structure with first-order iterative algorithms, in the high-dimensional regime where the number of samples $n=\alpha d$ is proportional to the covariate dimension $d$. Our findings unfold in three parts: (i) first, we identify under which conditions a \textit{trivial subspace} can be learned with a single step of a first-order algorithm for any $\alpha>0$; (ii) second, in the case where the trivial subspace is empty, we provide necessary and sufficient conditions for the existence of an \textit{easy subspace} consisting of directions that can be learned only above a certain sample complexity $\alpha>\alpha_c$. The critical threshold $\alpha_{c}$ marks the presence of a computational phase transition, in the sense that it is conjectured that no efficient iterative algorithm can succeed for $\alpha<\alpha_c$. For a limited but interesting set of hard directions, akin to the parity problem, $\alpha_c$ is found to diverge. Finally, (iii) we demonstrate that interactions between different directions can result in an intricate hierarchical learning phenomenon, where some directions can be learned sequentially when coupled to easier ones. Our analytical approach is built on the optimality of approximate message-passing algorithms among first-order iterative methods, delineating the fundamental learnability limit across a broad spectrum of algorithms, including neural networks trained with gradient descent.}
}
Endnote
%0 Conference Paper
%T Fundamental computational limits of weak learnability in high-dimensional multi-index models
%A Emanuele Troiani
%A Yatin Dandi
%A Leonardo Defilippis
%A Lenka Zdeborova
%A Bruno Loureiro
%A Florent Krzakala
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-troiani25a
%I PMLR
%P 2467--2475
%U https://proceedings.mlr.press/v258/troiani25a.html
%V 258
%X Multi-index models, functions which depend on the covariates only through a non-linear transformation of their projection on a subspace, are a useful benchmark for investigating feature learning with neural networks. This paper examines the theoretical boundaries of efficient learnability in this hypothesis class, focusing particularly on the minimum sample complexity required for weakly recovering their low-dimensional structure with first-order iterative algorithms, in the high-dimensional regime where the number of samples $n=\alpha d$ is proportional to the covariate dimension $d$. Our findings unfold in three parts: (i) first, we identify under which conditions a \textit{trivial subspace} can be learned with a single step of a first-order algorithm for any $\alpha>0$; (ii) second, in the case where the trivial subspace is empty, we provide necessary and sufficient conditions for the existence of an \textit{easy subspace} consisting of directions that can be learned only above a certain sample complexity $\alpha>\alpha_c$. The critical threshold $\alpha_{c}$ marks the presence of a computational phase transition, in the sense that it is conjectured that no efficient iterative algorithm can succeed for $\alpha<\alpha_c$. For a limited but interesting set of hard directions, akin to the parity problem, $\alpha_c$ is found to diverge. Finally, (iii) we demonstrate that interactions between different directions can result in an intricate hierarchical learning phenomenon, where some directions can be learned sequentially when coupled to easier ones. Our analytical approach is built on the optimality of approximate message-passing algorithms among first-order iterative methods, delineating the fundamental learnability limit across a broad spectrum of algorithms, including neural networks trained with gradient descent.
APA
Troiani, E., Dandi, Y., Defilippis, L., Zdeborova, L., Loureiro, B. & Krzakala, F. (2025). Fundamental computational limits of weak learnability in high-dimensional multi-index models. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:2467-2475. Available from https://proceedings.mlr.press/v258/troiani25a.html.