Tight bounds for maximum $\ell_1$-margin classifiers

Stefan Stojanovic, Konstantin Donhauser, Fanny Yang
Proceedings of The 35th International Conference on Algorithmic Learning Theory, PMLR 237:1055-1112, 2024.

Abstract

Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
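For concreteness: writing $w^*$ for the ground truth, the estimator studied here is the maximum $\ell_1$-margin classifier $\hat{w} \in \arg\max_{\|w\|_1 \le 1} \min_{i \le n} y_i \langle x_i, w \rangle$, which can be computed exactly as a linear program. The sketch below is our own illustration, not the authors' code; the function `max_l1_margin` and the toy data are assumptions made for the example. It solves the LP with SciPy on a noiseless, linearly separable instance with a hard-sparse ground truth, matching the setting of the abstract.

```python
# Minimal sketch: the maximum ell_1-margin classifier as a linear program.
# Variables z = [w_plus (d), w_minus (d), t], with w = w_plus - w_minus.
# Maximize the margin t subject to y_i <x_i, w> >= t and ||w||_1 <= 1.
import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    n, d = X.shape
    c = np.zeros(2 * d + 1)
    c[-1] = -1.0  # linprog minimizes, so minimize -t to maximize t

    # Margin constraints: t - y_i x_i^T (w_plus - w_minus) <= 0
    Yx = y[:, None] * X                                   # shape (n, d)
    A_margin = np.hstack([-Yx, Yx, np.ones((n, 1))])
    # ell_1 constraint: sum(w_plus + w_minus) <= 1
    A_norm = np.hstack([np.ones(2 * d), [0.0]])[None, :]
    A_ub = np.vstack([A_margin, A_norm])
    b_ub = np.concatenate([np.zeros(n), [1.0]])

    bounds = [(0, None)] * (2 * d) + [(None, None)]       # t is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    w = res.x[:d] - res.x[d : 2 * d]
    return w, res.x[-1]                                   # classifier, margin

# Usage on separable toy data drawn from a hard-sparse ground truth:
rng = np.random.default_rng(0)
n, d = 50, 200
w_star = np.zeros(d)
w_star[:3] = 1.0                      # hard-sparse ground truth
X = rng.standard_normal((n, d))
y = np.sign(X @ w_star)               # noiseless labels
w_hat, margin = max_l1_margin(X, y)
print("margin:", margin, " train error:", np.mean(np.sign(X @ w_hat) != y))
```

A positive optimal margin certifies linear separability, and the returned $\hat{w}$ interpolates the training labels. For the noisy regime discussed in the abstract, one would flip a fraction of the labels; when $d \gg n$ the data typically remains separable and the same LP still interpolates, which is the overfitting regime the paper analyzes.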

Cite this Paper

BibTeX
@InProceedings{pmlr-v237-stojanovic24a,
  title     = {Tight bounds for maximum $\ell_1$-margin classifiers},
  author    = {Stojanovic, Stefan and Donhauser, Konstantin and Yang, Fanny},
  booktitle = {Proceedings of The 35th International Conference on Algorithmic Learning Theory},
  pages     = {1055--1112},
  year      = {2024},
  editor    = {Vernade, Claire and Hsu, Daniel},
  volume    = {237},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--28 Feb},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v237/stojanovic24a/stojanovic24a.pdf},
  url       = {https://proceedings.mlr.press/v237/stojanovic24a.html},
  abstract  = {Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore first to show benign overfitting for the maximum $\ell_1$-margin classifier.}
}
APA
Stojanovic, S., Donhauser, K., & Yang, F. (2024). Tight bounds for maximum $\ell_1$-margin classifiers. Proceedings of The 35th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 237:1055-1112. Available from https://proceedings.mlr.press/v237/stojanovic24a.html.
