Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing (Extended Abstract)

Yihan Zhang, Hong Chang Ji, Ramji Venkataramanan, Marco Mondelli
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:5224-5230, 2024.

Abstract

We consider the problem of parameter estimation in a high-dimensional generalized linear model. Spectral methods obtained via the principal eigenvector of a suitable data-dependent matrix provide a simple yet surprisingly effective solution. However, despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured (i.i.d. Gaussian and Haar orthogonal) designs. In contrast, real-world data matrices are highly structured and exhibit non-trivial correlations. To address the problem, we consider correlated Gaussian designs capturing the anisotropic nature of the features via a covariance matrix $\Sigma$. Our main result is a precise asymptotic characterization of the performance of spectral estimators. This allows us to identify the optimal preprocessing that minimizes the number of samples needed for parameter estimation. Surprisingly, such preprocessing is universal across a broad set of statistical models, which partly addresses a conjecture on optimal spectral estimators for rotationally invariant designs. Our principled approach vastly improves upon previous heuristic methods, including for designs common in computational imaging and genetics. The proposed methodology, based on approximate message passing, is broadly applicable and opens the way to the precise characterization of spiked matrices and of the corresponding spectral methods in a variety of settings.
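As a rough illustration of the spectral method the abstract describes (the principal eigenvector of a data-dependent matrix), the sketch below runs it on a noiseless phase-retrieval model with an i.i.d. Gaussian design, which is the unstructured case rather than the correlated one studied in the paper. The function names, the model `y = |<a_i, x>|`, and the preprocessing `T(y) = y^2` are assumptions chosen for illustration. They are not the optimal preprocessing derived by the authors, which the abstract does not state.

```python
import numpy as np


def spectral_estimate(A, y, preprocess):
    """Return the principal eigenvector of D = (1/n) * sum_i T(y_i) a_i a_i^T.

    A is the (n, d) design matrix with rows a_i, y holds the n observations,
    and preprocess is the function T applied elementwise to y.
    """
    n, _ = A.shape
    t = preprocess(y)
    # (A.T * t) scales the i-th column of A.T (the vector a_i) by T(y_i).
    D = (A.T * t) @ A / n
    # eigh returns eigenvalues in ascending order, so the last column of
    # eigvecs is the eigenvector of the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(D)
    return eigvecs[:, -1]


def overlap(x_hat, x):
    """Absolute normalized correlation; the sign of x is not identifiable."""
    return abs(x_hat @ x) / (np.linalg.norm(x_hat) * np.linalg.norm(x))


def demo(seed=0, n=4000, d=40):
    """Hypothetical experiment: i.i.d. Gaussian design, y = |<a_i, x>|."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    A = rng.standard_normal((n, d))
    y = np.abs(A @ x)
    # Illustrative preprocessing T(y) = y^2 (an assumption, not the paper's).
    x_hat = spectral_estimate(A, y, lambda v: v ** 2)
    return overlap(x_hat, x)


if __name__ == "__main__":
    print(f"overlap with the true parameter: {demo():.3f}")
```

With this choice of `T`, the expectation of the matrix is the identity plus twice the outer product of `x` with itself, so its top eigenvector aligns with `x` once the number of samples is large relative to the dimension. Replacing the rows of `A` with correlated Gaussian vectors would give the structured setting the paper analyzes.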

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-zhang24c,
  title = {Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing (Extended Abstract)},
  author = {Zhang, Yihan and Ji, Hong Chang and Venkataramanan, Ramji and Mondelli, Marco},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages = {5224--5230},
  year = {2024},
  editor = {Agrawal, Shipra and Roth, Aaron},
  volume = {247},
  series = {Proceedings of Machine Learning Research},
  month = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v247/zhang24c/zhang24c.pdf},
  url = {https://proceedings.mlr.press/v247/zhang24c.html},
  abstract = {We consider the problem of parameter estimation in a high-dimensional generalized linear model. Spectral methods obtained via the principal eigenvector of a suitable data-dependent matrix provide a simple yet surprisingly effective solution. However, despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured (i.i.d. Gaussian and Haar orthogonal) designs. In contrast, real-world data matrices are highly structured and exhibit non-trivial correlations. To address the problem, we consider correlated Gaussian designs capturing the anisotropic nature of the features via a covariance matrix $\Sigma$. Our main result is a precise asymptotic characterization of the performance of spectral estimators. This allows us to identify the optimal preprocessing that minimizes the number of samples needed for parameter estimation. Surprisingly, such preprocessing is universal across a broad set of statistical models, which partly addresses a conjecture on optimal spectral estimators for rotationally invariant designs. Our principled approach vastly improves upon previous heuristic methods, including for designs common in computational imaging and genetics. The proposed methodology, based on approximate message passing, is broadly applicable and opens the way to the precise characterization of spiked matrices and of the corresponding spectral methods in a variety of settings.}
}
Endnote
%0 Conference Paper
%T Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing (Extended Abstract)
%A Yihan Zhang
%A Hong Chang Ji
%A Ramji Venkataramanan
%A Marco Mondelli
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-zhang24c
%I PMLR
%P 5224--5230
%U https://proceedings.mlr.press/v247/zhang24c.html
%V 247
%X We consider the problem of parameter estimation in a high-dimensional generalized linear model. Spectral methods obtained via the principal eigenvector of a suitable data-dependent matrix provide a simple yet surprisingly effective solution. However, despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured (i.i.d. Gaussian and Haar orthogonal) designs. In contrast, real-world data matrices are highly structured and exhibit non-trivial correlations. To address the problem, we consider correlated Gaussian designs capturing the anisotropic nature of the features via a covariance matrix $\Sigma$. Our main result is a precise asymptotic characterization of the performance of spectral estimators. This allows us to identify the optimal preprocessing that minimizes the number of samples needed for parameter estimation. Surprisingly, such preprocessing is universal across a broad set of statistical models, which partly addresses a conjecture on optimal spectral estimators for rotationally invariant designs. Our principled approach vastly improves upon previous heuristic methods, including for designs common in computational imaging and genetics. The proposed methodology, based on approximate message passing, is broadly applicable and opens the way to the precise characterization of spiked matrices and of the corresponding spectral methods in a variety of settings.
APA
Zhang, Y., Ji, H.C., Venkataramanan, R. & Mondelli, M. (2024). Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing (Extended Abstract). Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:5224-5230. Available from https://proceedings.mlr.press/v247/zhang24c.html.
