Connectivity-contrastive learning: Combining causal discovery and representation learning for multimodal data
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:3399-3426, 2023.
Causal discovery methods typically extract causal relations between multiple nodes (variables) based on univariate observations of each node. However, one frequently encounters situations where each node is multivariate, i.e., has multiple observational modalities. Furthermore, the observed modalities may be generated through an unknown mixing process, so that some original latent variables are entangled inside the nodes. In such a multimodal case, existing frameworks cannot be applied. To analyze such data, we propose a new causal representation learning framework called connectivity-contrastive learning (CCL). CCL disentangles the observational mixing and extracts a set of mutually independent latent components, each having a separate causal structure between the nodes. Learning proceeds by a novel self-supervised method whose pretext task is to predict the label of a pair of nodes from the observations of that pair. We present theorems which show that CCL can indeed identify both the latent components and the multimodal causal structure under weak technical assumptions, up to some indeterminacy. Finally, we experimentally show that CCL achieves superior causal discovery performance compared to state-of-the-art baselines, in particular demonstrating robustness against latent confounders.
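
To make the pretext task concrete, the following is a minimal sketch of how a training set for predicting node-pair labels might be assembled. It is not the authors' implementation. The abstract does not specify the details, so the array layout `(T, N, D)`, the use of ordered pairs, the concatenation of the two nodes' observations, and the helper name `build_pair_dataset` are all assumptions made for illustration.

```python
import itertools
import numpy as np


def build_pair_dataset(x):
    """Assemble (input, label) examples for a node-pair classification task.

    x: array of shape (T, N, D) with T samples, N nodes, D modalities per node
       (this layout is an assumption, not taken from the paper).
    Returns:
      inputs: shape (T * P, 2 * D), observations of both nodes in a pair
      labels: shape (T * P,), integer index of the pair
      pairs:  list of the P = N * (N - 1) ordered node pairs (i, j), i != j
    """
    T, N, D = x.shape
    # Assumption: every ordered pair of distinct nodes gets its own class label.
    pairs = list(itertools.permutations(range(N), 2))
    # For each pair, stack the two nodes' observations side by side: (T, 2 * D).
    blocks = [np.concatenate([x[:, i, :], x[:, j, :]], axis=1) for i, j in pairs]
    # Blocks are stacked pair by pair, so the labels repeat T times per pair.
    inputs = np.concatenate(blocks, axis=0)
    labels = np.repeat(np.arange(len(pairs)), T)
    return inputs, labels, pairs


# Synthetic data purely for illustration: 5 samples, 3 nodes, 2 modalities.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(5, 3, 2))
inputs_demo, labels_demo, pairs_demo = build_pair_dataset(x_demo)
```

A classifier trained on `inputs` to predict `labels` would then serve as the self-supervised learner. Which model is used, and how the latent components and causal structure are read off from it, is beyond what the abstract states.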