Principled learning method for Wasserstein distributionally robust optimization with local perturbations

Yongchan Kwon, Wonyoung Kim, Joong-Ho Won, Myunghee Cho Paik
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:5567-5576, 2020.

Abstract

Wasserstein distributionally robust optimization (WDRO) attempts to learn a model that minimizes the local worst-case risk in the vicinity of the empirical data distribution, where the vicinity is defined by a Wasserstein ball. While WDRO has received attention as a promising tool for inference since its introduction, its theoretical understanding has not fully matured. Gao et al. (2017) proposed a minimizer based on a tractable approximation of the local worst-case risk, but without showing risk consistency. In this paper, we propose a minimizer based on a novel approximation theorem and provide the corresponding risk consistency results. Furthermore, we develop WDRO inference for locally perturbed data, which includes Mixup (Zhang et al., 2017) as a special case. We show that our approximation and risk consistency results naturally extend to the case where data are locally perturbed. Numerical experiments on image classification datasets demonstrate the robustness of the proposed method. Our results show that the proposed method achieves significantly higher accuracy than baseline models on noisy datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-kwon20a,
  title     = {Principled learning method for {W}asserstein distributionally robust optimization with local perturbations},
  author    = {Kwon, Yongchan and Kim, Wonyoung and Won, Joong-Ho and Paik, Myunghee Cho},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {5567--5576},
  year      = {2020},
  editor    = {Daumé III, Hal and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/kwon20a/kwon20a.pdf},
  url       = {http://proceedings.mlr.press/v119/kwon20a.html},
  abstract  = {Wasserstein distributionally robust optimization (WDRO) attempts to learn a model that minimizes the local worst-case risk in the vicinity of the empirical data distribution defined by Wasserstein ball. While WDRO has received attention as a promising tool for inference since its introduction, its theoretical understanding has not been fully matured. Gao et al. (2017) proposed a minimizer based on a tractable approximation of the local worst-case risk, but without showing risk consistency. In this paper, we propose a minimizer based on a novel approximation theorem and provide the corresponding risk consistency results. Furthermore, we develop WDRO inference for locally perturbed data that include the Mixup (Zhang et al., 2017) as a special case. We show that our approximation and risk consistency results naturally extend to the cases when data are locally perturbed. Numerical experiments demonstrate robustness of the proposed method using image classification datasets. Our results show that the proposed method achieves significantly higher accuracy than baseline models on noisy datasets.}
}
Endnote
%0 Conference Paper
%T Principled learning method for Wasserstein distributionally robust optimization with local perturbations
%A Yongchan Kwon
%A Wonyoung Kim
%A Joong-Ho Won
%A Myunghee Cho Paik
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-kwon20a
%I PMLR
%P 5567--5576
%U http://proceedings.mlr.press/v119/kwon20a.html
%V 119
%X Wasserstein distributionally robust optimization (WDRO) attempts to learn a model that minimizes the local worst-case risk in the vicinity of the empirical data distribution defined by Wasserstein ball. While WDRO has received attention as a promising tool for inference since its introduction, its theoretical understanding has not been fully matured. Gao et al. (2017) proposed a minimizer based on a tractable approximation of the local worst-case risk, but without showing risk consistency. In this paper, we propose a minimizer based on a novel approximation theorem and provide the corresponding risk consistency results. Furthermore, we develop WDRO inference for locally perturbed data that include the Mixup (Zhang et al., 2017) as a special case. We show that our approximation and risk consistency results naturally extend to the cases when data are locally perturbed. Numerical experiments demonstrate robustness of the proposed method using image classification datasets. Our results show that the proposed method achieves significantly higher accuracy than baseline models on noisy datasets.
APA
Kwon, Y., Kim, W., Won, J.-H., & Paik, M. C. (2020). Principled learning method for Wasserstein distributionally robust optimization with local perturbations. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:5567-5576. Available from http://proceedings.mlr.press/v119/kwon20a.html

Related Material