Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Jingfeng Wu; Difan Zou; Zixiang Chen; Vladimir Braverman; Quanquan Gu; Sham M. Kakade

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:37919-37951, 2023.

Abstract

This paper considers the problem of learning single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron [Kakade et al. 2011], and provide its dimension-free risk upper bounds for high-dimensional ReLU regression in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we also provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds provide a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation. These results together suggest that GLM-tron might be more preferable than SGD for high-dimensional ReLU regression.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-wu23ab,
  title = 	 {Finite-Sample Analysis of Learning High-Dimensional Single {R}e{LU} Neuron},
  author =       {Wu, Jingfeng and Zou, Difan and Chen, Zixiang and Braverman, Vladimir and Gu, Quanquan and Kakade, Sham M.},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {37919--37951},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/wu23ab/wu23ab.pdf},
  url = 	 {https://proceedings.mlr.press/v202/wu23ab.html},
  abstract = 	 {This paper considers the problem of learning single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron [Kakade et al. 2011], and provide its dimension-free risk upper bounds for high-dimensional ReLU regression in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we also provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds provide a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation. These results together suggest that GLM-tron might be more preferable than SGD for high-dimensional ReLU regression.}
}

Endnote

%0 Conference Paper
%T Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
%A Jingfeng Wu
%A Difan Zou
%A Zixiang Chen
%A Vladimir Braverman
%A Quanquan Gu
%A Sham M. Kakade
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett	
%F pmlr-v202-wu23ab
%I PMLR
%P 37919--37951
%U https://proceedings.mlr.press/v202/wu23ab.html
%V 202
%X This paper considers the problem of learning single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron [Kakade et al. 2011], and provide its dimension-free risk upper bounds for high-dimensional ReLU regression in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we also provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds provide a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation. These results together suggest that GLM-tron might be more preferable than SGD for high-dimensional ReLU regression.

APA


Wu, J., Zou, D., Chen, Z., Braverman, V., Gu, Q. & Kakade, S.M.. (2023). Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:37919-37951 Available from https://proceedings.mlr.press/v202/wu23ab.html.

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Abstract

Cite this Paper

Related Material