Sharp Asymptotics and Optimal Performance for Inference in Binary Models

Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3739-3749, 2020.

Abstract

We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary models (such as Signed, Logistic or Probit), by constructing appropriate loss functions that achieve it. More interestingly, for binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one. Numerical simulations corroborate our theoretical findings and suggest they are accurate even for relatively small problem dimensions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-taheri20a, title = {Sharp Asymptotics and Optimal Performance for Inference in Binary Models}, author = {Taheri, Hossein and Pedarsani, Ramtin and Thrampoulidis, Christos}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {3739--3749}, year = {2020}, editor = {Silvia Chiappa and Roberto Calandra}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/taheri20a/taheri20a.pdf}, url = { http://proceedings.mlr.press/v108/taheri20a.html }, abstract = {We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary models (such as Signed, Logistic or Probit), by constructing appropriate loss functions that achieve it. More interestingly, for binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one. Numerical simulations corroborate our theoretical findings and suggest they are accurate even for relatively small problem dimensions. } }
Endnote
%0 Conference Paper %T Sharp Asymptotics and Optimal Performance for Inference in Binary Models %A Hossein Taheri %A Ramtin Pedarsani %A Christos Thrampoulidis %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-taheri20a %I PMLR %P 3739--3749 %U http://proceedings.mlr.press/v108/taheri20a.html %V 108 %X We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary models (such as Signed, Logistic or Probit), by constructing appropriate loss functions that achieve it. More interestingly, for binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one. Numerical simulations corroborate our theoretical findings and suggest they are accurate even for relatively small problem dimensions.
APA
Taheri, H., Pedarsani, R. & Thrampoulidis, C.. (2020). Sharp Asymptotics and Optimal Performance for Inference in Binary Models. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3739-3749 Available from http://proceedings.mlr.press/v108/taheri20a.html .

Related Material