Narrow Margins: Classification, Margins and Fat Tails

Francois Buet-Golfouse
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1127-1135, 2021.

Abstract

It is well-known that, for separable data, the regularised two-class logistic regression or support vector machine re-normalised estimate converges to the maximal margin classifier as the regularisation hyper-parameter $\lambda$ goes to 0. The fact that different loss functions may lead to the same solution is of theoretical and practical relevance as margin maximisation allows more straightforward considerations in terms of generalisation and geometric interpretation. We investigate the case where this convergence property is not guaranteed to hold and show that it can be fully characterised by the distribution of error terms in the latent variable interpretation of linear classifiers. In particular, if errors follow a regularly varying distribution, then the regularised and re-normalised estimate does not converge to the maximal margin classifier. This shows that classification with fat tails has a qualitatively different behaviour, which should be taken into account when considering real-life data.
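The two regimes described in the abstract can be sketched numerically. The toy experiment below is illustrative and not taken from the paper: on a small, made-up separable dataset we fit a linear classifier through the origin by minimising the regularised negative log-likelihood under two latent-error distributions, logistic (exponentially decaying tail) and Cauchy (regularly varying tail), and track the geometric margin of the renormalised estimate as $\lambda$ shrinks. The data, loss implementations, and function names are all assumptions chosen for this sketch.

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (classifier through the origin), chosen so
# that the maximal margin direction differs from the mean direction
# sum(y_i x_i). The maximal geometric margin here is roughly 0.999.
X = np.array([[1.0, 0.0], [0.5, 10.0], [-1.0, 0.0], [-0.5, -10.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Each loss is -log F(z) for a latent-error CDF F, with its derivative.
def logistic_loss(z):                      # light (exponential) tail
    return np.logaddexp(0.0, -z)           # log(1 + exp(-z)), stable

def logistic_dloss(z):
    return -1.0 / (1.0 + np.exp(z))

def cauchy_loss(z):                        # regularly varying tail
    return -np.log(0.5 + np.arctan(z) / np.pi)

def cauchy_dloss(z):
    F = 0.5 + np.arctan(z) / np.pi
    return -1.0 / (np.pi * (1.0 + z * z) * F)

def renormalised_fit(loss, dloss, lam):
    """Minimise sum_i loss(y_i w.x_i) + lam/2 ||w||^2, return w/||w||."""
    obj = lambda w: loss(y * (X @ w)).sum() + 0.5 * lam * (w @ w)
    jac = lambda w: (dloss(y * (X @ w)) * y) @ X + lam * w
    w = minimize(obj, np.zeros(2), jac=jac, method="L-BFGS-B").x
    return w / np.linalg.norm(w)

def min_margin(u):
    """Geometric margin achieved by the unit-norm direction u."""
    return (y * (X @ u)).min()

for lam in (10.0, 1e-2, 1e-4):
    m_log = min_margin(renormalised_fit(logistic_loss, logistic_dloss, lam))
    m_cau = min_margin(renormalised_fit(cauchy_loss, cauchy_dloss, lam))
    print(f"lambda={lam:g}  logistic margin={m_log:.3f}  cauchy margin={m_cau:.3f}")
```

Under the logistic (light-tailed) loss, the margin of the renormalised estimate climbs towards the maximal margin as $\lambda$ decreases, while under the Cauchy loss it stays bounded away from it, consistent with the abstract's claim about regularly varying errors.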

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-buet-golfouse21a,
  title     = {Narrow Margins: Classification, Margins and Fat Tails},
  author    = {Buet-Golfouse, Francois},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {1127--1135},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/buet-golfouse21a/buet-golfouse21a.pdf},
  url       = {https://proceedings.mlr.press/v139/buet-golfouse21a.html},
  abstract  = {It is well-known that, for separable data, the regularised two-class logistic regression or support vector machine re-normalised estimate converges to the maximal margin classifier as the regularisation hyper-parameter $\lambda$ goes to 0. The fact that different loss functions may lead to the same solution is of theoretical and practical relevance as margin maximisation allows more straightforward considerations in terms of generalisation and geometric interpretation. We investigate the case where this convergence property is not guaranteed to hold and show that it can be fully characterised by the distribution of error terms in the latent variable interpretation of linear classifiers. In particular, if errors follow a regularly varying distribution, then the regularised and re-normalised estimate does not converge to the maximal margin classifier. This shows that classification with fat tails has a qualitatively different behaviour, which should be taken into account when considering real-life data.}
}
Endnote
%0 Conference Paper
%T Narrow Margins: Classification, Margins and Fat Tails
%A Francois Buet-Golfouse
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-buet-golfouse21a
%I PMLR
%P 1127--1135
%U https://proceedings.mlr.press/v139/buet-golfouse21a.html
%V 139
%X It is well-known that, for separable data, the regularised two-class logistic regression or support vector machine re-normalised estimate converges to the maximal margin classifier as the regularisation hyper-parameter $\lambda$ goes to 0. The fact that different loss functions may lead to the same solution is of theoretical and practical relevance as margin maximisation allows more straightforward considerations in terms of generalisation and geometric interpretation. We investigate the case where this convergence property is not guaranteed to hold and show that it can be fully characterised by the distribution of error terms in the latent variable interpretation of linear classifiers. In particular, if errors follow a regularly varying distribution, then the regularised and re-normalised estimate does not converge to the maximal margin classifier. This shows that classification with fat tails has a qualitatively different behaviour, which should be taken into account when considering real-life data.
APA
Buet-Golfouse, F. (2021). Narrow Margins: Classification, Margins and Fat Tails. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1127-1135. Available from https://proceedings.mlr.press/v139/buet-golfouse21a.html.