Multi-Domain Causal Representation Learning via Weak Distributional Invariances

Kartik Ahuja, Amin Mansouri, Yixin Wang
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:865-873, 2024.

Abstract

Causal representation learning has emerged as the center of action in causal machine learning research. In particular, multi-domain datasets present a natural opportunity for showcasing the advantages of causal representation learning over standard unsupervised representation learning. While recent works have taken crucial steps towards learning causal representations, they often lack applicability to multi-domain datasets due to over-simplifying assumptions about the data, e.g., that each domain comes from a different single-node perfect intervention. In this work, we relax these assumptions and capitalize on the following observation: there often exists a subset of latents for which certain distributional properties (e.g., support, variance) remain stable across domains; this property holds when, for example, each domain comes from a multi-node imperfect intervention. Leveraging this observation, we show that autoencoders that incorporate such invariances can provably identify the stable set of latents from the rest across different settings.
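The core mechanism the abstract describes, i.e. an autoencoder objective that rewards latent coordinates whose variance stays stable across domains, can be illustrated with a minimal sketch of such an invariance penalty. This is a hypothetical illustration, not the paper's implementation; the function name and the choice of variance as the stable property are assumptions for the example.

```python
from statistics import pvariance

def variance_invariance_penalty(latents_by_domain, stable_idx):
    """Sum, over the designated 'stable' latent coordinates, of the squared
    deviation of each domain's variance from the cross-domain mean variance.

    latents_by_domain: list of domains, each a list of latent vectors (lists).
    stable_idx: coordinates hypothesized to be distributionally stable.
    The penalty is zero exactly when every chosen coordinate has identical
    variance in all domains.
    """
    penalty = 0.0
    for j in stable_idx:
        variances = [pvariance([z[j] for z in domain])
                     for domain in latents_by_domain]
        mean_var = sum(variances) / len(variances)
        penalty += sum((v - mean_var) ** 2 for v in variances)
    return penalty

# Two toy domains: coordinate 0 keeps the same variance across domains,
# coordinate 1 is shifted by a (hypothetical) imperfect intervention.
domain_a = [[1.0, 2.0], [-1.0, -2.0]]
domain_b = [[1.0, 5.0], [-1.0, -5.0]]
print(variance_invariance_penalty([domain_a, domain_b], [0]))  # 0.0 (stable)
print(variance_invariance_penalty([domain_a, domain_b], [1]))  # > 0 (unstable)
```

In a full training loop, a term like this would be added to the reconstruction loss so that the encoder is pushed to route the cross-domain-stable factors into the designated coordinates.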

Cite this Paper

BibTeX
@InProceedings{pmlr-v238-ahuja24a,
  title     = {Multi-Domain Causal Representation Learning via Weak Distributional Invariances},
  author    = {Ahuja, Kartik and Mansouri, Amin and Wang, Yixin},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {865--873},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/ahuja24a/ahuja24a.pdf},
  url       = {https://proceedings.mlr.press/v238/ahuja24a.html},
  abstract  = {Causal representation learning has emerged as the center of action in causal machine learning research. In particular, multi-domain datasets present a natural opportunity for showcasing the advantages of causal representation learning over standard unsupervised representation learning. While recent works have taken crucial steps towards learning causal representations, they often lack applicability to multi-domain datasets due to over-simplifying assumptions about the data; e.g. each domain comes from a different single-node perfect intervention. In this work, we relax these assumptions and capitalize on the following observation: there often exists a subset of latents whose certain distributional properties (e.g., support, variance) remain stable across domains; this property holds when, for example, each domain comes from a multi-node imperfect intervention. Leveraging this observation, we show that autoencoders that incorporate such invariances can provably identify the stable set of latents from the rest across different settings.}
}
Endnote
%0 Conference Paper
%T Multi-Domain Causal Representation Learning via Weak Distributional Invariances
%A Kartik Ahuja
%A Amin Mansouri
%A Yixin Wang
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-ahuja24a
%I PMLR
%P 865--873
%U https://proceedings.mlr.press/v238/ahuja24a.html
%V 238
%X Causal representation learning has emerged as the center of action in causal machine learning research. In particular, multi-domain datasets present a natural opportunity for showcasing the advantages of causal representation learning over standard unsupervised representation learning. While recent works have taken crucial steps towards learning causal representations, they often lack applicability to multi-domain datasets due to over-simplifying assumptions about the data; e.g. each domain comes from a different single-node perfect intervention. In this work, we relax these assumptions and capitalize on the following observation: there often exists a subset of latents whose certain distributional properties (e.g., support, variance) remain stable across domains; this property holds when, for example, each domain comes from a multi-node imperfect intervention. Leveraging this observation, we show that autoencoders that incorporate such invariances can provably identify the stable set of latents from the rest across different settings.
APA
Ahuja, K., Mansouri, A. & Wang, Y. (2024). Multi-Domain Causal Representation Learning via Weak Distributional Invariances. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:865-873. Available from https://proceedings.mlr.press/v238/ahuja24a.html.