Learning good interventions in causal graphs via covering

Ayush Sawarni, Rahul Madhavan, Gaurav Sinha, Siddharth Barman
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1827-1836, 2023.

Abstract

We study the causal bandit problem that entails identifying a near-optimal intervention from a specified set A of (possibly non-atomic) interventions over a given causal graph. Here, an optimal intervention in A is one that maximizes the expected value of a designated reward variable in the graph, and we use the standard notion of simple regret to quantify near optimality. Considering Bernoulli random variables and causal graphs on N vertices with constant in-degree, prior work has achieved a worst-case guarantee of O(N/sqrt(T)) for simple regret. The current work utilizes the idea of covering interventions (which are not necessarily contained within A) and establishes a simple regret guarantee of O(sqrt(N/T)). Notably, and in contrast to prior work, our simple regret bound depends only on explicit parameters of the problem instance. We also go beyond prior work and achieve a simple regret guarantee for causal graphs with unobserved variables. Further, we perform experiments to show improvements over baselines in this setting.
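To make the covering idea concrete, below is a minimal, hypothetical sketch, not the authors' algorithm. It assumes the simplest "parallel" causal graph (independent Bernoulli causes X_1..X_N of a Bernoulli reward Y), where the observational distribution alone acts as a single covering intervention, so the means of all 2N atomic interventions can be estimated from one shared pool of T exploration samples. The reward model, graph, and all parameter values are illustrative assumptions.

```python
# Hypothetical sketch of estimating many interventions from shared samples
# gathered under a covering set.  In this parallel graph the empty
# (observational) intervention covers every (variable, value) pair; richer
# graphs require a small set of covering interventions instead.
import numpy as np

rng = np.random.default_rng(0)

N, T = 10, 20_000                  # number of causes, exploration budget
p = rng.uniform(0.1, 0.9, size=N)  # P(X_i = 1) in the observational regime
w = rng.uniform(-0.5, 0.5, size=N)

def sample_round(do_idx=None, do_val=None):
    """Draw one sample, optionally under an atomic intervention do(X_i = v)."""
    x = (rng.random(N) < p).astype(int)
    if do_idx is not None:
        x[do_idx] = do_val
    y_prob = 0.5 + x @ w / N       # assumed reward model: linear in the causes
    return x, int(rng.random() < y_prob)

# Exploration phase: T rounds under the (covering) observational distribution.
X = np.empty((T, N), dtype=int)
Y = np.empty(T, dtype=int)
for t in range(T):
    X[t], Y[t] = sample_round()

# With no confounding, E[Y | do(X_i = v)] = E[Y | X_i = v]; estimate it for
# all 2N atomic interventions from the *same* pool of T samples.
means = np.zeros((N, 2))
for i in range(N):
    for v in (0, 1):
        mask = X[:, i] == v
        means[i, v] = Y[mask].mean() if mask.any() else 0.5

best_i, best_v = np.unravel_index(means.argmax(), means.shape)
print(f"empirical best intervention: do(X_{best_i} = {best_v}), "
      f"estimated reward {means[best_i, best_v]:.3f}")
```

Because every sample informs every candidate intervention, each conditional mean is estimated from on the order of T (rather than T/N) observations, which is the kind of sharing behind a sqrt(N/T)-style simple regret rather than N/sqrt(T).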

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-sawarni23a,
  title     = {Learning good interventions in causal graphs via covering},
  author    = {Sawarni, Ayush and Madhavan, Rahul and Sinha, Gaurav and Barman, Siddharth},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1827--1836},
  year      = {2023},
  editor    = {Evans, Robin J. and Shpitser, Ilya},
  volume    = {216},
  series    = {Proceedings of Machine Learning Research},
  month     = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v216/sawarni23a/sawarni23a.pdf},
  url       = {https://proceedings.mlr.press/v216/sawarni23a.html}
}
Endnote
%0 Conference Paper
%T Learning good interventions in causal graphs via covering
%A Ayush Sawarni
%A Rahul Madhavan
%A Gaurav Sinha
%A Siddharth Barman
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-sawarni23a
%I PMLR
%P 1827--1836
%U https://proceedings.mlr.press/v216/sawarni23a.html
%V 216
APA
Sawarni, A., Madhavan, R., Sinha, G. & Barman, S. (2023). Learning good interventions in causal graphs via covering. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1827-1836. Available from https://proceedings.mlr.press/v216/sawarni23a.html.