Tight bounds for maximum $\ell_1$-margin classifiers

Stefan Stojanovic; Konstantin Donhauser; Fanny Yang

Tight bounds for maximum $\ell_1$ -margin classifiers

Stefan Stojanovic, Konstantin Donhauser, Fanny Yang

Proceedings of The 35th International Conference on Algorithmic Learning Theory, PMLR 237:1055-1112, 2024.

Abstract

Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum

$\ell_1$ -margin classifier, a.k.a. sparse hard-margin SVM, in high dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the

$\ell_1$ -norm achieve improved statistical rates for hard sparse ground truths. We show that surprisingly, this adaptivity does not apply to the maximum

$\ell_1$ -margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order

$\frac{\|\w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order

$\frac{1}{\sqrt{\log(d/n)}}$ . We are therefore first to show benign overfitting for the maximum

$\ell_1$ -margin classifier.

Cite this Paper

BibTeX


@InProceedings{pmlr-v237-stojanovic24a,
  title = 	 {Tight bounds for maximum $\ell_1$-margin classifiers},
  author =       {Stojanovic, Stefan and Donhauser, Konstantin and Yang, Fanny},
  booktitle = 	 {Proceedings of The 35th International Conference on Algorithmic Learning Theory},
  pages = 	 {1055--1112},
  year = 	 {2024},
  editor = 	 {Vernade, Claire and Hsu, Daniel},
  volume = 	 {237},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--28 Feb},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v237/stojanovic24a/stojanovic24a.pdf},
  url = 	 {https://proceedings.mlr.press/v237/stojanovic24a.html},
  abstract = 	 {Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|\w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore first to show benign overfitting for the maximum $\ell_1$-margin classifier.}
}

Endnote

%0 Conference Paper
%T Tight bounds for maximum $\ell_1$-margin classifiers
%A Stefan Stojanovic
%A Konstantin Donhauser
%A Fanny Yang
%B Proceedings of The 35th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Claire Vernade
%E Daniel Hsu	
%F pmlr-v237-stojanovic24a
%I PMLR
%P 1055--1112
%U https://proceedings.mlr.press/v237/stojanovic24a.html
%V 237
%X Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|\w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore first to show benign overfitting for the maximum $\ell_1$-margin classifier.

APA


Stojanovic, S., Donhauser, K. & Yang, F.. (2024). Tight bounds for maximum $\ell_1$-margin classifiers. Proceedings of The 35th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 237:1055-1112 Available from https://proceedings.mlr.press/v237/stojanovic24a.html.

Related Material

Download PDF

Tight bounds for maximum ℓ1\ell_1-margin classifiers

Abstract

Cite this Paper

Related Material

Tight bounds for maximum $\ell_1$ -margin classifiers