Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery

Filip Kovačević, Zhang Yihan, Marco Mondelli
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:3354-3404, 2025.

Abstract

Multi-index models provide a popular framework to investigate the learnability of functions with low-dimensional structure and, also due to their connections with neural networks, they have been the object of intense recent study. In this paper, we focus on recovering the subspace spanned by the signals via spectral estimators, a family of methods routinely used in practice, often as a warm start for iterative algorithms. Our main technical contribution is a precise asymptotic characterization of the performance of spectral methods when the sample size and the input dimension grow proportionally and the dimension $p$ of the space to recover is fixed. Specifically, we locate the top-$p$ eigenvalues of the spectral matrix and establish the overlaps between the corresponding eigenvectors (which give the spectral estimators) and a basis of the signal subspace. Our analysis unveils a phase transition phenomenon: as the sample complexity grows, eigenvalues escape from the bulk of the spectrum and, when that happens, the corresponding eigenvectors recover directions of the desired subspace. The precise characterization we put forward enables the optimization of the data preprocessing, thus making it possible to identify the spectral estimator that requires the minimal sample size for weak recovery.
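
To make the setup concrete, the following is a minimal sketch of a generic spectral estimator for a multi-index model $y = g(W^\top x)$, following the standard recipe the abstract refers to: form a weighted second-moment matrix from preprocessed responses and take its top-$p$ eigenvectors. The link function g and the preprocessing T below are illustrative placeholders, not the optimized choices derived in the paper.

import numpy as np

# Minimal sketch of a spectral estimator for a multi-index model
# y = g(W^T x). The standard recipe: build the spectral matrix
#   D_n = (1/n) * sum_i T(y_i) x_i x_i^T
# for a scalar preprocessing function T, and estimate the signal
# subspace span(W) by the top-p eigenvectors of D_n.
# The link g and the preprocessing T used here are illustrative
# placeholders, not the paper's optimized choices.

rng = np.random.default_rng(0)
n, d, p = 4000, 500, 2                            # samples, input dim, subspace dim

W, _ = np.linalg.qr(rng.standard_normal((d, p)))  # orthonormal basis of the signal subspace
X = rng.standard_normal((n, d))                   # i.i.d. Gaussian covariates
y = np.sum((X @ W) ** 2, axis=1)                  # toy link: y = sum_k <w_k, x>^2

T = y - y.mean()                                  # placeholder preprocessing T(y)
D = (X.T * T) @ X / n                             # D_n = (1/n) sum_i T(y_i) x_i x_i^T

eigvals, eigvecs = np.linalg.eigh(D)              # eigenvalues in ascending order
U_hat = eigvecs[:, -p:]                           # spectral estimator: top-p eigenvectors

# Quality of recovery: the singular values of U_hat^T W are the overlaps;
# values near 1 mean the corresponding direction of span(W) is recovered.
print("top eigenvalues:", eigvals[-p:])
print("subspace overlaps:", np.linalg.svd(U_hat.T @ W, compute_uv=False))

For this particular toy link, a direct Gaussian computation gives $\mathbb{E}[D_n] = 2WW^\top$, so the top-$p$ eigenvectors align with the signal subspace once $n/d$ is large enough; the paper's phase-transition analysis makes "large enough" precise and characterizes the preprocessing minimizing it.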

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-kovacevic25a,
  title     = {Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery},
  author    = {Kova\v{c}evi\'c, Filip and Yihan, Zhang and Mondelli, Marco},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {3354--3404},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/kovacevic25a/kovacevic25a.pdf},
  url       = {https://proceedings.mlr.press/v291/kovacevic25a.html}
}
Endnote
%0 Conference Paper
%T Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery
%A Filip Kovačević
%A Zhang Yihan
%A Marco Mondelli
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-kovacevic25a
%I PMLR
%P 3354--3404
%U https://proceedings.mlr.press/v291/kovacevic25a.html
%V 291
APA
Kovačević, F., Yihan, Z., & Mondelli, M. (2025). Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:3354-3404. Available from https://proceedings.mlr.press/v291/kovacevic25a.html.
