Proxy Methods for Domain Adaptation

Katherine Tsai, Stephen R Pfohl, Olawale Salaudeen, Nicole Chiou, Matt Kusner, Alexander D’Amour, Sanmi Koyejo, Arthur Gretton
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3961-3969, 2024.

Abstract

We study the problem of domain adaptation under distribution shift, where the shift is due to a change in the distribution of an unobserved, latent variable that confounds both the covariates and the labels. In this setting, neither the covariate shift nor the label shift assumptions apply. Our approach to adaptation employs proximal causal learning, a technique for estimating causal effects in settings where proxies of unobserved confounders are available. We demonstrate that proxy variables allow for adaptation to distribution shift without explicitly recovering or modeling latent variables. We consider two settings, (i) Concept Bottleneck: an additional “concept” variable is observed that mediates the relationship between the covariates and labels; (ii) Multi-domain: training data from multiple source domains is available, where each source domain exhibits a different distribution over the latent confounder. We develop a two-stage kernel estimation approach to adapt to complex distribution shifts in both settings. In our experiments, we show that our approach outperforms other methods, notably those which explicitly recover the latent confounder.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-tsai24b,
  title = {Proxy Methods for Domain Adaptation},
  author = {Tsai, Katherine and Pfohl, Stephen R and Salaudeen, Olawale and Chiou, Nicole and Kusner, Matt and D'Amour, Alexander and Koyejo, Sanmi and Gretton, Arthur},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {3961--3969},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/tsai24b/tsai24b.pdf},
  url = {https://proceedings.mlr.press/v238/tsai24b.html},
  abstract = {We study the problem of domain adaptation under distribution shift, where the shift is due to a change in the distribution of an unobserved, latent variable that confounds both the covariates and the labels. In this setting, neither the covariate shift nor the label shift assumptions apply. Our approach to adaptation employs proximal causal learning, a technique for estimating causal effects in settings where proxies of unobserved confounders are available. We demonstrate that proxy variables allow for adaptation to distribution shift without explicitly recovering or modeling latent variables. We consider two settings, (i) Concept Bottleneck: an additional “concept” variable is observed that mediates the relationship between the covariates and labels; (ii) Multi-domain: training data from multiple source domains is available, where each source domain exhibits a different distribution over the latent confounder. We develop a two-stage kernel estimation approach to adapt to complex distribution shifts in both settings. In our experiments, we show that our approach outperforms other methods, notably those which explicitly recover the latent confounder.}
}
Endnote
%0 Conference Paper
%T Proxy Methods for Domain Adaptation
%A Katherine Tsai
%A Stephen R Pfohl
%A Olawale Salaudeen
%A Nicole Chiou
%A Matt Kusner
%A Alexander D’Amour
%A Sanmi Koyejo
%A Arthur Gretton
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-tsai24b
%I PMLR
%P 3961--3969
%U https://proceedings.mlr.press/v238/tsai24b.html
%V 238
%X We study the problem of domain adaptation under distribution shift, where the shift is due to a change in the distribution of an unobserved, latent variable that confounds both the covariates and the labels. In this setting, neither the covariate shift nor the label shift assumptions apply. Our approach to adaptation employs proximal causal learning, a technique for estimating causal effects in settings where proxies of unobserved confounders are available. We demonstrate that proxy variables allow for adaptation to distribution shift without explicitly recovering or modeling latent variables. We consider two settings, (i) Concept Bottleneck: an additional “concept” variable is observed that mediates the relationship between the covariates and labels; (ii) Multi-domain: training data from multiple source domains is available, where each source domain exhibits a different distribution over the latent confounder. We develop a two-stage kernel estimation approach to adapt to complex distribution shifts in both settings. In our experiments, we show that our approach outperforms other methods, notably those which explicitly recover the latent confounder.
APA
Tsai, K., Pfohl, S. R., Salaudeen, O., Chiou, N., Kusner, M., D’Amour, A., Koyejo, S. & Gretton, A. (2024). Proxy Methods for Domain Adaptation. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3961-3969. Available from https://proceedings.mlr.press/v238/tsai24b.html.
