Low-Dimensional Density Ratio Estimation for Covariate Shift Correction

Petar Stojanov, Mingming Gong, Jaime Carbonell, Kun Zhang
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:3449-3458, 2019.

Abstract

Covariate shift is a prevalent setting for supervised learning in the wild when the training and test data are drawn from different time periods, different but related domains, or via different sampling strategies. This paper addresses a transfer learning setting, with covariate shift between source and target domains. Most existing methods for correcting covariate shift exploit density ratios of the features to reweight the source-domain data, and when the features are high-dimensional, the estimated density ratios may suffer large estimation variances, leading to poor performance of prediction under covariate shift. In this work, we investigate the dependence of covariate shift correction performance on the dimensionality of the features, and propose a correction method that finds a low-dimensional representation of the features, which takes into account features relevant to the target $Y$, and exploits the density ratio of this representation for importance reweighting. We discuss the factors that affect the performance of our method, and demonstrate its capabilities on both pseudo-real data and real-world applications.
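As a rough illustration of why dimensionality matters for importance reweighting, the sketch below reweights source samples by a density ratio and reports how many samples effectively carry the weight. It is not the method proposed in the paper: the Gaussian source and target, the function name `reweighting_demo`, and all parameter values are assumptions made for this toy example, and the true density ratio is used in place of an estimated one.

```python
import math
import random


def reweighting_demo(dim, n=20_000, shift=1.0, seed=0):
    """Estimate a target-domain mean from source samples via importance weights.

    Assumed toy setup (not from the paper): the source is N(0, I) and the
    target is N(shift * 1, I) in `dim` dimensions, so the density ratio is
    known in closed form. Returns (weighted_mean, effective_sample_size),
    where weighted_mean estimates the target mean of the first coordinate.
    """
    rng = random.Random(seed)
    weights = []
    first_coords = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # log p_target(x) - log p_source(x) = shift * sum(x) - 0.5 * dim * shift**2
        log_w = shift * sum(x) - 0.5 * dim * shift ** 2
        weights.append(math.exp(log_w))
        first_coords.append(x[0])

    total = sum(weights)
    # Self-normalized importance-weighted estimate of the target mean.
    weighted_mean = sum(w * v for w, v in zip(weights, first_coords)) / total
    # Effective sample size: (sum of weights)^2 / (sum of squared weights).
    ess = total ** 2 / sum(w * w for w in weights)
    return weighted_mean, ess


if __name__ == "__main__":
    for dim in (1, 2, 5):
        mean, ess = reweighting_demo(dim)
        print(f"dim={dim}: weighted mean={mean:.3f}, effective sample size={ess:.0f}")
```

In this toy setup the effective sample size shrinks as `dim` grows, because a few source points receive most of the weight. That is one intuition for why a density ratio over a low-dimensional representation may be easier to work with than one over the full feature vector.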

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-stojanov19a,
  title = {Low-Dimensional Density Ratio Estimation for Covariate Shift Correction},
  author = {Stojanov, Petar and Gong, Mingming and Carbonell, Jaime and Zhang, Kun},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages = {3449--3458},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume = {89},
  series = {Proceedings of Machine Learning Research},
  month = {16--18 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v89/stojanov19a/stojanov19a.pdf},
  url = {http://proceedings.mlr.press/v89/stojanov19a.html},
  abstract = {Covariate shift is a prevalent setting for supervised learning in the wild when the training and test data are drawn from different time periods, different but related domains, or via different sampling strategies. This paper addresses a transfer learning setting, with covariate shift between source and target domains. Most existing methods for correcting covariate shift exploit density ratios of the features to reweight the source-domain data, and when the features are high-dimensional, the estimated density ratios may suffer large estimation variances, leading to poor performance of prediction under covariate shift. In this work, we investigate the dependence of covariate shift correction performance on the dimensionality of the features, and propose a correction method that finds a low-dimensional representation of the features, which takes into account features relevant to the target $Y$, and exploits the density ratio of this representation for importance reweighting. We discuss the factors that affect the performance of our method, and demonstrate its capabilities on both pseudo-real data and real-world applications.}
}
Endnote
%0 Conference Paper
%T Low-Dimensional Density Ratio Estimation for Covariate Shift Correction
%A Petar Stojanov
%A Mingming Gong
%A Jaime Carbonell
%A Kun Zhang
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-stojanov19a
%I PMLR
%P 3449--3458
%U http://proceedings.mlr.press/v89/stojanov19a.html
%V 89
%X Covariate shift is a prevalent setting for supervised learning in the wild when the training and test data are drawn from different time periods, different but related domains, or via different sampling strategies. This paper addresses a transfer learning setting, with covariate shift between source and target domains. Most existing methods for correcting covariate shift exploit density ratios of the features to reweight the source-domain data, and when the features are high-dimensional, the estimated density ratios may suffer large estimation variances, leading to poor performance of prediction under covariate shift. In this work, we investigate the dependence of covariate shift correction performance on the dimensionality of the features, and propose a correction method that finds a low-dimensional representation of the features, which takes into account features relevant to the target $Y$, and exploits the density ratio of this representation for importance reweighting. We discuss the factors that affect the performance of our method, and demonstrate its capabilities on both pseudo-real data and real-world applications.
APA
Stojanov, P., Gong, M., Carbonell, J. & Zhang, K. (2019). Low-Dimensional Density Ratio Estimation for Covariate Shift Correction. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:3449-3458. Available from http://proceedings.mlr.press/v89/stojanov19a.html.
