On the sample complexity of parameter estimation in logistic regression with normal design

Daniel Hsu, Arya Mazumdar
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:2418-2437, 2024.

Abstract

The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given $\ell_2$ error, in terms of the dimension and the inverse temperature, with standard normal covariates. The inverse temperature controls the signal-to-noise ratio of the data generation process. While both generalization bounds and the asymptotic performance of the maximum-likelihood estimator for logistic regression are well-studied, the non-asymptotic sample complexity that shows the dependence on the error and the inverse temperature for parameter estimation is absent from previous analyses. We show that the sample complexity curve has two change-points in terms of the inverse temperature, clearly separating the low, moderate, and high temperature regimes.
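The data generation process described above can be sketched in a few lines: covariates are drawn from a standard normal, labels are drawn from the logistic model with an inverse temperature $\beta$ scaling the signal, and the parameter direction is recovered by maximum likelihood via plain gradient ascent. All concrete values below (dimension, sample size, $\beta$, learning rate) are illustrative choices, not constants from the paper.

```python
import math
import random

random.seed(0)

d = 5        # dimension (illustrative)
n = 5000     # sample size (illustrative)
beta = 2.0   # inverse temperature: controls the signal-to-noise ratio

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Unknown unit-norm parameter theta*.
theta_star = [random.gauss(0.0, 1.0) for _ in range(d)]
norm = math.sqrt(sum(t * t for t in theta_star))
theta_star = [t / norm for t in theta_star]

# Standard normal covariates; labels y ~ Bernoulli(sigmoid(beta * <theta*, x>)).
X = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = []
for x in X:
    z = beta * sum(t * xi for t, xi in zip(theta_star, x))
    y.append(1 if random.random() < sigmoid(z) else 0)

# Maximum-likelihood estimate of beta * theta* by gradient ascent
# on the average log-likelihood (the problem is convex).
w = [0.0] * d
lr = 0.5
for _ in range(150):
    grad = [0.0] * d
    for x, yi in zip(X, y):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j in range(d):
            grad[j] += (yi - p) * x[j]
    for j in range(d):
        w[j] += lr * grad[j] / n

# Normalize to get a direction estimate and its l2 error against theta*.
wnorm = math.sqrt(sum(wi * wi for wi in w))
theta_hat = [wi / wnorm for wi in w]
err = math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_hat, theta_star)))
```

The quantity `err` is the $\ell_2$ parameter-estimation error whose sample-complexity dependence on the dimension and on $\beta$ the paper characterizes; rerunning the sketch while varying `beta` and `n` gives an empirical feel for the three temperature regimes.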

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-hsu24a,
  title     = {On the sample complexity of parameter estimation in logistic regression with normal design},
  author    = {Hsu, Daniel and Mazumdar, Arya},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {2418--2437},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/hsu24a/hsu24a.pdf},
  url       = {https://proceedings.mlr.press/v247/hsu24a.html},
  abstract  = {The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given $\ell_2$ error, in terms of the dimension and the inverse temperature, with standard normal covariates. The inverse temperature controls the signal-to-noise ratio of the data generation process. While both generalization bounds and the asymptotic performance of the maximum-likelihood estimator for logistic regression are well-studied, the non-asymptotic sample complexity that shows the dependence on the error and the inverse temperature for parameter estimation is absent from previous analyses. We show that the sample complexity curve has two change-points in terms of the inverse temperature, clearly separating the low, moderate, and high temperature regimes.}
}
Endnote
%0 Conference Paper
%T On the sample complexity of parameter estimation in logistic regression with normal design
%A Daniel Hsu
%A Arya Mazumdar
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-hsu24a
%I PMLR
%P 2418--2437
%U https://proceedings.mlr.press/v247/hsu24a.html
%V 247
%X The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given $\ell_2$ error, in terms of the dimension and the inverse temperature, with standard normal covariates. The inverse temperature controls the signal-to-noise ratio of the data generation process. While both generalization bounds and the asymptotic performance of the maximum-likelihood estimator for logistic regression are well-studied, the non-asymptotic sample complexity that shows the dependence on the error and the inverse temperature for parameter estimation is absent from previous analyses. We show that the sample complexity curve has two change-points in terms of the inverse temperature, clearly separating the low, moderate, and high temperature regimes.
APA
Hsu, D., & Mazumdar, A. (2024). On the sample complexity of parameter estimation in logistic regression with normal design. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:2418-2437. Available from https://proceedings.mlr.press/v247/hsu24a.html.
