Robust Unsupervised Learning via L-statistic Minimization

Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:7524-7533, 2021.

Abstract

Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by a general descent algorithm which minimizes an $L$-statistic criterion over the model class, weighting small losses more. Our analysis characterizes the robustness of the method in terms of bounds on the reconstruction error relative to the underlying unperturbed distribution. As a byproduct, we prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning, a result which may be of independent interest. Numerical experiments with \textsc{kmeans} clustering and principal subspace analysis demonstrate the effectiveness of our approach.
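The core idea — replacing the usual empirical average of losses with an L-statistic that puts more weight on small losses, so that high-loss perturbing points are down-weighted — can be illustrated with a minimal sketch. The snippet below is not the paper's descent algorithm; it is a simplified alternating-minimization variant for \textsc{kmeans} using a hard-trimming L-statistic (weight 1 on the smallest `keep_frac` fraction of losses, 0 elsewhere), with `keep_frac`, `robust_kmeans`, and `l_statistic_loss` being illustrative names introduced here.

```python
import numpy as np

def l_statistic_loss(X, centers, keep_frac=0.8):
    """Trimmed L-statistic criterion: mean of the smallest losses only."""
    # squared distance of each point to its nearest center
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1)
    # weight 1 on the lowest keep_frac fraction of losses, 0 on the rest
    m = int(np.ceil(keep_frac * len(X)))
    return np.sort(d2)[:m].mean()

def robust_kmeans(X, k=2, keep_frac=0.8, iters=50, seed=0, init=None):
    """Alternating minimization of the trimmed criterion (a sketch,
    not the paper's general descent method)."""
    rng = np.random.default_rng(seed)
    centers = (X[rng.choice(len(X), k, replace=False)].copy()
               if init is None else init.copy())
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        point_loss = d2[np.arange(len(X)), assign]
        # keep only the points with the smallest losses (suspected inliers)
        m = int(np.ceil(keep_frac * len(X)))
        kept = np.argsort(point_loss)[:m]
        for j in range(k):
            members = kept[assign[kept] == j]
            if len(members):
                centers[j] = X[members].mean(0)
    return centers
```

On data drawn from two tight clusters plus a small fraction of distant perturbing points, the trimming step excludes the perturbing points from the center updates, so the recovered centers track the unperturbed clusters.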

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-maurer21a,
  title     = {Robust Unsupervised Learning via L-statistic Minimization},
  author    = {Maurer, Andreas and Parletta, Daniela Angela and Paudice, Andrea and Pontil, Massimiliano},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {7524--7533},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/maurer21a/maurer21a.pdf},
  url       = {https://proceedings.mlr.press/v139/maurer21a.html},
  abstract  = {Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by a general descent algorithm which minimizes an $L$-statistic criterion over the model class, weighting small losses more. Our analysis characterizes the robustness of the method in terms of bounds on the reconstruction error relative to the underlying unperturbed distribution. As a byproduct, we prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning, a result which may be of independent interest. Numerical experiments with \textsc{kmeans} clustering and principal subspace analysis demonstrate the effectiveness of our approach.}
}
Endnote
%0 Conference Paper
%T Robust Unsupervised Learning via L-statistic Minimization
%A Andreas Maurer
%A Daniela Angela Parletta
%A Andrea Paudice
%A Massimiliano Pontil
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-maurer21a
%I PMLR
%P 7524--7533
%U https://proceedings.mlr.press/v139/maurer21a.html
%V 139
%X Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by a general descent algorithm which minimizes an $L$-statistic criterion over the model class, weighting small losses more. Our analysis characterizes the robustness of the method in terms of bounds on the reconstruction error relative to the underlying unperturbed distribution. As a byproduct, we prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning, a result which may be of independent interest. Numerical experiments with \textsc{kmeans} clustering and principal subspace analysis demonstrate the effectiveness of our approach.
APA
Maurer, A., Parletta, D.A., Paudice, A. &amp; Pontil, M. (2021). Robust Unsupervised Learning via L-statistic Minimization. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:7524-7533. Available from https://proceedings.mlr.press/v139/maurer21a.html.