Double-Weighting for Covariate Shift Adaptation

José I. Segovia-Martín, Santiago Mazuelas, Anqi Liu
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:30439-30457, 2023.

Abstract

Supervised learning is often affected by a covariate shift in which the marginal distributions of instances (covariates $x$) of training and testing samples $p_\text{tr}(x)$ and $p_\text{te}(x)$ are different but the label conditionals coincide. Existing approaches address such covariate shift by either using the ratio $p_\text{te}(x)/p_\text{tr}(x)$ to weight training samples (reweighted methods) or using the ratio $p_\text{tr}(x)/p_\text{te}(x)$ to weight testing samples (robust methods). However, the performance of such approaches can be poor under support mismatch or when the above ratios take large values. We propose a minimax risk classification (MRC) approach for covariate shift adaptation that avoids such limitations by weighting both training and testing samples. In addition, we develop effective techniques that obtain both sets of weights and generalize the conventional kernel mean matching method. We provide novel generalization bounds for our method that show a significant increase in the effective sample size compared with reweighted methods. The proposed method also achieves enhanced classification performance in both synthetic and empirical experiments.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-segovia-martin23a, title = {Double-Weighting for Covariate Shift Adaptation}, author = {Segovia-Martín, José I. and Mazuelas, Santiago and Liu, Anqi}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {30439--30457}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/segovia-martin23a/segovia-martin23a.pdf}, url = {https://proceedings.mlr.press/v202/segovia-martin23a.html}, abstract = {Supervised learning is often affected by a covariate shift in which the marginal distributions of instances (covariates $x$) of training and testing samples $p_\text{tr}(x)$ and $p_\text{te}(x)$ are different but the label conditionals coincide. Existing approaches address such covariate shift by either using the ratio $p_\text{te}(x)/p_\text{tr}(x)$ to weight training samples (reweighted methods) or using the ratio $p_\text{tr}(x)/p_\text{te}(x)$ to weight testing samples (robust methods). However, the performance of such approaches can be poor under support mismatch or when the above ratios take large values. We propose a minimax risk classification (MRC) approach for covariate shift adaptation that avoids such limitations by weighting both training and testing samples. In addition, we develop effective techniques that obtain both sets of weights and generalize the conventional kernel mean matching method. We provide novel generalization bounds for our method that show a significant increase in the effective sample size compared with reweighted methods. The proposed method also achieves enhanced classification performance in both synthetic and empirical experiments.} }
Endnote
%0 Conference Paper %T Double-Weighting for Covariate Shift Adaptation %A José I. Segovia-Martín %A Santiago Mazuelas %A Anqi Liu %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-segovia-martin23a %I PMLR %P 30439--30457 %U https://proceedings.mlr.press/v202/segovia-martin23a.html %V 202 %X Supervised learning is often affected by a covariate shift in which the marginal distributions of instances (covariates $x$) of training and testing samples $p_\text{tr}(x)$ and $p_\text{te}(x)$ are different but the label conditionals coincide. Existing approaches address such covariate shift by either using the ratio $p_\text{te}(x)/p_\text{tr}(x)$ to weight training samples (reweighted methods) or using the ratio $p_\text{tr}(x)/p_\text{te}(x)$ to weight testing samples (robust methods). However, the performance of such approaches can be poor under support mismatch or when the above ratios take large values. We propose a minimax risk classification (MRC) approach for covariate shift adaptation that avoids such limitations by weighting both training and testing samples. In addition, we develop effective techniques that obtain both sets of weights and generalize the conventional kernel mean matching method. We provide novel generalization bounds for our method that show a significant increase in the effective sample size compared with reweighted methods. The proposed method also achieves enhanced classification performance in both synthetic and empirical experiments.
APA
Segovia-Martín, J.I., Mazuelas, S. & Liu, A.. (2023). Double-Weighting for Covariate Shift Adaptation. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:30439-30457 Available from https://proceedings.mlr.press/v202/segovia-martin23a.html.

Related Material