Layered Sampling for Robust Optimization Problems

Hu Ding, Zixiu Wang
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:2556-2566, 2020.

Abstract

In the real world, datasets often contain outliers. Most existing algorithms for handling outliers have high time complexity (e.g., quadratic or cubic). Coresets are a popular approach for compressing data to speed up optimization algorithms. However, current coreset methods cannot be easily extended to handle outliers. In this paper, we propose a new variant of the coreset technique, layered sampling, to deal with two fundamental robust optimization problems: k-median/means clustering with outliers and linear regression with outliers. This new coreset method is particularly suitable for speeding up the iterative algorithms (which often improve the solution within a local range) for those robust optimization problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-ding20c,
  title = {Layered Sampling for Robust Optimization Problems},
  author = {Ding, Hu and Wang, Zixiu},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {2556--2566},
  year = {2020},
  editor = {Daumé, III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/ding20c/ding20c.pdf},
  url = {http://proceedings.mlr.press/v119/ding20c.html},
  abstract = {In real world, our datasets often contain outliers. Most existing algorithms for handling outliers take high time complexities (\emph{e.g.} quadratic or cubic complexity). \emph{Coreset} is a popular approach for compressing data so as to speed up the optimization algorithms. However, the current coreset methods cannot be easily extended to handle the case with outliers. In this paper, we propose a new variant of coreset technique, \emph{layered sampling}, to deal with two fundamental robust optimization problems: \emph{$k$-median/means clustering with outliers} and \emph{linear regression with outliers}. This new coreset method is in particular suitable to speed up the iterative algorithms (which often improve the solution within a local range) for those robust optimization problems.}
}
Endnote
%0 Conference Paper
%T Layered Sampling for Robust Optimization Problems
%A Hu Ding
%A Zixiu Wang
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-ding20c
%I PMLR
%P 2556--2566
%U http://proceedings.mlr.press/v119/ding20c.html
%V 119
%X In real world, our datasets often contain outliers. Most existing algorithms for handling outliers take high time complexities (\emph{e.g.} quadratic or cubic complexity). \emph{Coreset} is a popular approach for compressing data so as to speed up the optimization algorithms. However, the current coreset methods cannot be easily extended to handle the case with outliers. In this paper, we propose a new variant of coreset technique, \emph{layered sampling}, to deal with two fundamental robust optimization problems: \emph{$k$-median/means clustering with outliers} and \emph{linear regression with outliers}. This new coreset method is in particular suitable to speed up the iterative algorithms (which often improve the solution within a local range) for those robust optimization problems.
APA
Ding, H. & Wang, Z. (2020). Layered Sampling for Robust Optimization Problems. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:2556-2566. Available from http://proceedings.mlr.press/v119/ding20c.html.

Related Material