Learning a Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent

Guillaume Braun, Minh Ha Quang, Masaaki Imaizumi
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:1216-1224, 2025.

Abstract

We investigate the problem of learning a Single Index Model (SIM)—a popular model for studying the ability of neural networks to learn features—from anisotropic Gaussian inputs by training a neuron using vanilla Stochastic Gradient Descent (SGD). While the isotropic case has been extensively studied, the anisotropic case has received less attention and the impact of the covariance matrix on the learning dynamics remains unclear. For instance, Mousavi-Hosseini et al. (2023b) proposed a spherical SGD that requires a separate estimation of the data covariance matrix, thereby oversimplifying the influence of covariance. In this study, we analyze the learning dynamics of vanilla SGD under the SIM with anisotropic input data, demonstrating that vanilla SGD automatically adapts to the data’s covariance structure. Leveraging these results, we derive upper and lower bounds on the sample complexity using a notion of effective dimension that is determined by the structure of the covariance matrix instead of the input data dimension. Finally, we validate and extend our theoretical findings through numerical simulations, demonstrating the practical effectiveness of our approach in adapting to anisotropic data, which has implications for efficient training of neural networks.
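The setting described in the abstract can be sketched in a few lines of code. The snippet below is an illustrative toy version only, not the paper's exact setup: the tanh link, the two-block covariance spectrum, the step size, and the use of tr(Σ)/‖Σ‖ as the effective dimension are all assumptions made here for demonstration. It runs vanilla online SGD on a single neuron with anisotropic Gaussian inputs and tracks the alignment with the target index direction.

```python
import numpy as np

# Illustrative sketch only (not the paper's exact setup or hyperparameters):
# vanilla online SGD on a single neuron fit to a single-index model
# y = sigma(<w_star, x>) with anisotropic Gaussian inputs x ~ N(0, Sigma).

rng = np.random.default_rng(0)
d = 50

# Anisotropic spectrum: a few strong directions, the rest weak.
eigvals = np.concatenate([np.full(5, 5.0), np.full(d - 5, 0.1)])
Sigma_sqrt = np.diag(np.sqrt(eigvals))

# One common notion of effective dimension, tr(Sigma) / ||Sigma||_op
# (the paper's exact definition may differ).
d_eff = eigvals.sum() / eigvals.max()   # much smaller than d = 50

w_star = np.zeros(d)
w_star[0] = 1.0                          # target index direction (a strong direction)
w = rng.standard_normal(d) / np.sqrt(d)  # random initialization
eta = 0.02                               # step size (illustrative choice)

for _ in range(50_000):
    x = Sigma_sqrt @ rng.standard_normal(d)   # fresh sample x ~ N(0, Sigma)
    y = np.tanh(w_star @ x)                   # noiseless single-index label
    pred = np.tanh(w @ x)
    # one-sample gradient of the squared loss 0.5 * (pred - y)^2,
    # using d/dz tanh(z) = 1 - tanh(z)^2
    w -= eta * (pred - y) * (1.0 - pred**2) * x

# Alignment of the learned weight with the target direction.
overlap = (w @ w_star) / np.linalg.norm(w)
```

With the covariance spectrum above, `d_eff` is about 5.9 versus the ambient dimension 50, and the alignment `overlap` approaches 1 over training without any preprocessing of the data by the covariance matrix, which is the qualitative behavior the paper's analysis addresses.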

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-braun25a,
  title     = {Learning a Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent},
  author    = {Braun, Guillaume and Quang, Minh Ha and Imaizumi, Masaaki},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {1216--1224},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/braun25a/braun25a.pdf},
  url       = {https://proceedings.mlr.press/v258/braun25a.html}
}
Endnote
%0 Conference Paper
%T Learning a Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent
%A Guillaume Braun
%A Minh Ha Quang
%A Masaaki Imaizumi
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-braun25a
%I PMLR
%P 1216--1224
%U https://proceedings.mlr.press/v258/braun25a.html
%V 258
APA
Braun, G., Quang, M.H. & Imaizumi, M. (2025). Learning a Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:1216-1224. Available from https://proceedings.mlr.press/v258/braun25a.html.