Does Distributionally Robust Supervised Learning Give Robust Classifiers?

Weihua Hu, Gang Niu, Issei Sato, Masashi Sugiyama
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:2029-2037, 2018.

Abstract

Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow a different distribution from training data. DRSL with f-divergences explicitly considers the worst-case distribution shift by minimizing the adversarially reweighted training loss. In this paper, we analyze this DRSL, focusing on the classification scenario. Since the DRSL is explicitly formulated for a distribution shift scenario, we naturally expect it to give a robust classifier that can aggressively handle shifted distributions. However, surprisingly, we prove that the DRSL just ends up giving a classifier that exactly fits the given training distribution, which is too pessimistic. This pessimism comes from two sources: the particular losses used in classification and the fact that the variety of distributions to which the DRSL tries to be robust is too wide. Motivated by our analysis, we propose simple DRSL that overcomes this pessimism and empirically demonstrate its effectiveness.
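To make the "adversarially reweighted training loss" concrete, here is a minimal sketch of the inner adversarial step for one common choice of f-divergence, the KL divergence, where duality gives the worst-case weights in closed form (each sample's weight grows exponentially with its loss). The temperature `beta` and the toy per-sample losses are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: DRSL's inner adversary reweights per-sample losses within
# an f-divergence ball around the empirical distribution. For the KL case,
# the optimal weights are w_i ∝ exp(loss_i / beta) (exponential tilting).
import numpy as np

def adversarially_reweighted_loss(losses, beta=1.0):
    """Worst-case reweighted mean of per-sample losses (KL ambiguity set).

    beta is an assumed tilt temperature: small beta -> the adversary
    concentrates weight on the hardest examples.
    """
    losses = np.asarray(losses, dtype=float)
    shifted = (losses - losses.max()) / beta  # subtract max for stability
    w = np.exp(shifted)
    w /= w.sum()                      # adversarial weights, sum to 1
    return float(np.dot(w, losses))   # reweighted training loss

per_sample = [0.1, 0.2, 2.5]          # toy per-sample losses (assumed)
robust = adversarially_reweighted_loss(per_sample, beta=0.5)
plain = float(np.mean(per_sample))
```

Because the adversary shifts mass toward high-loss examples, `robust` upper-bounds the plain average `plain` and never exceeds the largest per-sample loss; the paper's point is that with classification losses this worst-case emphasis collapses back onto fitting the training distribution.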

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-hu18a,
  title     = {Does Distributionally Robust Supervised Learning Give Robust Classifiers?},
  author    = {Hu, Weihua and Niu, Gang and Sato, Issei and Sugiyama, Masashi},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {2029--2037},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/hu18a/hu18a.pdf},
  url       = {https://proceedings.mlr.press/v80/hu18a.html},
  abstract  = {Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow a different distribution from training data. DRSL with f-divergences explicitly considers the worst-case distribution shift by minimizing the adversarially reweighted training loss. In this paper, we analyze this DRSL, focusing on the classification scenario. Since the DRSL is explicitly formulated for a distribution shift scenario, we naturally expect it to give a robust classifier that can aggressively handle shifted distributions. However, surprisingly, we prove that the DRSL just ends up giving a classifier that exactly fits the given training distribution, which is too pessimistic. This pessimism comes from two sources: the particular losses used in classification and the fact that the variety of distributions to which the DRSL tries to be robust is too wide. Motivated by our analysis, we propose simple DRSL that overcomes this pessimism and empirically demonstrate its effectiveness.}
}
Endnote
%0 Conference Paper
%T Does Distributionally Robust Supervised Learning Give Robust Classifiers?
%A Weihua Hu
%A Gang Niu
%A Issei Sato
%A Masashi Sugiyama
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-hu18a
%I PMLR
%P 2029--2037
%U https://proceedings.mlr.press/v80/hu18a.html
%V 80
%X Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow a different distribution from training data. DRSL with f-divergences explicitly considers the worst-case distribution shift by minimizing the adversarially reweighted training loss. In this paper, we analyze this DRSL, focusing on the classification scenario. Since the DRSL is explicitly formulated for a distribution shift scenario, we naturally expect it to give a robust classifier that can aggressively handle shifted distributions. However, surprisingly, we prove that the DRSL just ends up giving a classifier that exactly fits the given training distribution, which is too pessimistic. This pessimism comes from two sources: the particular losses used in classification and the fact that the variety of distributions to which the DRSL tries to be robust is too wide. Motivated by our analysis, we propose simple DRSL that overcomes this pessimism and empirically demonstrate its effectiveness.
APA
Hu, W., Niu, G., Sato, I. &amp; Sugiyama, M. (2018). Does Distributionally Robust Supervised Learning Give Robust Classifiers? Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:2029-2037. Available from https://proceedings.mlr.press/v80/hu18a.html.