Model Agnostic Sample Reweighting for Out-of-Distribution Learning

Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:27203-27221, 2022.

Abstract

Distributionally robust optimization (DRO) and invariant risk minimization (IRM) are two popular methods proposed to improve the out-of-distribution (OOD) generalization performance of machine learning models. While effective for small models, these methods have been observed to be vulnerable to overfitting when applied to large overparameterized models. This work proposes a principled method, Model Agnostic samPLe rEweighting (MAPLE), to effectively address the OOD problem, especially in overparameterized scenarios. Our key idea is to find an effective reweighting of the training samples so that standard empirical risk minimization training of a large model on the weighted training data leads to superior OOD generalization performance. The overfitting issue is addressed through a bilevel formulation that searches for the sample reweighting, in which the generalization complexity depends on the search space of sample weights rather than on the model size. We present a theoretical analysis in the linear case to prove the insensitivity of MAPLE to model size, and empirically verify that it surpasses state-of-the-art methods by a large margin.
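To make the bilevel idea concrete, here is a minimal sketch of the formulation in our own notation; the specific outer objective is an assumption, not taken from the paper. Let w denote the vector of sample weights on the probability simplex, θ the model parameters, ℓ the per-sample training loss, and R_ood an OOD criterion such as the risk on held-out environments:

\[
\min_{\mathbf{w} \in \Delta^{n-1}} \; \mathcal{R}_{\mathrm{ood}}\!\left(\theta^{*}(\mathbf{w})\right)
\quad \text{subject to} \quad
\theta^{*}(\mathbf{w}) \in \arg\min_{\theta} \sum_{i=1}^{n} w_i \,\ell\!\left(f_{\theta}(x_i),\, y_i\right).
\]

The inner problem is ordinary weighted ERM, so any standard training pipeline can solve it; the outer search ranges only over the n-dimensional weight vector, which is why the generalization complexity of the search is governed by the sample-weight space rather than by the (possibly much larger) parameter count.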

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-zhou22d,
  title     = {Model Agnostic Sample Reweighting for Out-of-Distribution Learning},
  author    = {Zhou, Xiao and Lin, Yong and Pi, Renjie and Zhang, Weizhong and Xu, Renzhe and Cui, Peng and Zhang, Tong},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {27203--27221},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/zhou22d/zhou22d.pdf},
  url       = {https://proceedings.mlr.press/v162/zhou22d.html}
}
Endnote
%0 Conference Paper
%T Model Agnostic Sample Reweighting for Out-of-Distribution Learning
%A Xiao Zhou
%A Yong Lin
%A Renjie Pi
%A Weizhong Zhang
%A Renzhe Xu
%A Peng Cui
%A Tong Zhang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-zhou22d
%I PMLR
%P 27203--27221
%U https://proceedings.mlr.press/v162/zhou22d.html
%V 162
APA
Zhou, X., Lin, Y., Pi, R., Zhang, W., Xu, R., Cui, P. & Zhang, T. (2022). Model Agnostic Sample Reweighting for Out-of-Distribution Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:27203-27221. Available from https://proceedings.mlr.press/v162/zhou22d.html.