Integer Programming for Causal Structure Learning in the Presence of Latent Variables

Rui Chen, Sanjeeb Dash, Tian Gao
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1550-1560, 2021.

Abstract

The problem of finding an ancestral acyclic directed mixed graph (ADMG) that represents the causal relationships between a set of variables is an important area of research on causal inference. Most existing score-based structure learning methods focus on learning directed acyclic graph (DAG) models without latent variables. A number of score-based methods have recently been proposed for the ADMG learning, yet they are heuristic in nature and do not guarantee an optimal solution. We propose a novel exact score-based method that solves an integer programming (IP) formulation and returns a score-maximizing ancestral ADMG for a set of continuous variables that follow a multivariate Gaussian distribution. We generalize the state-of-the-art IP model for DAG learning problems and derive new classes of valid inequalities to formulate an IP model for ADMG learning. Empirically, our model can be solved efficiently for medium-sized problems and achieves better accuracy than state-of-the-art score-based methods as well as benchmark constraint-based methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-chen21c, title = {Integer Programming for Causal Structure Learning in the Presence of Latent Variables}, author = {Chen, Rui and Dash, Sanjeeb and Gao, Tian}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {1550--1560}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/chen21c/chen21c.pdf}, url = {https://proceedings.mlr.press/v139/chen21c.html}, abstract = {The problem of finding an ancestral acyclic directed mixed graph (ADMG) that represents the causal relationships between a set of variables is an important area of research on causal inference. Most existing score-based structure learning methods focus on learning directed acyclic graph (DAG) models without latent variables. A number of score-based methods have recently been proposed for the ADMG learning, yet they are heuristic in nature and do not guarantee an optimal solution. We propose a novel exact score-based method that solves an integer programming (IP) formulation and returns a score-maximizing ancestral ADMG for a set of continuous variables that follow a multivariate Gaussian distribution. We generalize the state-of-the-art IP model for DAG learning problems and derive new classes of valid inequalities to formulate an IP model for ADMG learning. Empirically, our model can be solved efficiently for medium-sized problems and achieves better accuracy than state-of-the-art score-based methods as well as benchmark constraint-based methods.} }
Endnote
%0 Conference Paper %T Integer Programming for Causal Structure Learning in the Presence of Latent Variables %A Rui Chen %A Sanjeeb Dash %A Tian Gao %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-chen21c %I PMLR %P 1550--1560 %U https://proceedings.mlr.press/v139/chen21c.html %V 139 %X The problem of finding an ancestral acyclic directed mixed graph (ADMG) that represents the causal relationships between a set of variables is an important area of research on causal inference. Most existing score-based structure learning methods focus on learning directed acyclic graph (DAG) models without latent variables. A number of score-based methods have recently been proposed for the ADMG learning, yet they are heuristic in nature and do not guarantee an optimal solution. We propose a novel exact score-based method that solves an integer programming (IP) formulation and returns a score-maximizing ancestral ADMG for a set of continuous variables that follow a multivariate Gaussian distribution. We generalize the state-of-the-art IP model for DAG learning problems and derive new classes of valid inequalities to formulate an IP model for ADMG learning. Empirically, our model can be solved efficiently for medium-sized problems and achieves better accuracy than state-of-the-art score-based methods as well as benchmark constraint-based methods.
APA
Chen, R., Dash, S. & Gao, T.. (2021). Integer Programming for Causal Structure Learning in the Presence of Latent Variables. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1550-1560 Available from https://proceedings.mlr.press/v139/chen21c.html.

Related Material