A causal bandit approach to learning good atomic interventions in presence of unobserved confounders
Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PMLR 180:1328-1338, 2022.
Abstract
We study the problem of determining the best atomic intervention in a Causal Bayesian Network (CBN) specified only by its causal graph. We model this as a stochastic multi-armed bandit (MAB) problem with side-information, where interventions on the CBN correspond to arms of the bandit instance. First, we propose a simple regret minimization algorithm that takes as input a causal graph with observable and unobservable nodes and, in $T$ exploration rounds, achieves $\tilde{O}(\sqrt{m(\mathcal{C})/T})$ expected simple regret. Here $m(\mathcal{C})$ is a parameter that depends on the input CBN $\mathcal{C}$ and can be much smaller than the number of arms. We also show that this bound is almost optimal for CBNs whose causal graphs have an $n$-ary tree structure. Next, we propose a cumulative regret minimization algorithm that takes as input a causal graph with observable nodes and outperforms optimal MAB algorithms that do not use causal side-information. We experimentally compare both our algorithms with the best known algorithms in the literature.
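To make the setup concrete, the sketch below simulates the simple-regret objective on a toy CBN. This is not the paper's algorithm: the model (`toy_cbn`, with independent binary causes and a linear-in-expectation reward `Y`), the uniform-exploration baseline (`uniform_explore`), and all parameter choices are illustrative assumptions. Arms are atomic interventions $do(X_i = v)$ plus the purely observational arm, and simple regret is the gap between the best arm's mean reward and that of the arm output after $T$ rounds.

```python
# Minimal sketch of simple-regret minimization over atomic interventions
# on a toy CBN. Illustrative only; not the algorithm from the paper.
import random


def toy_cbn(n_vars=4, seed=0):
    """Toy CBN: X_i ~ Bernoulli(p_i) independently, Y ~ Bernoulli(w . X).

    Returns (sample, true_mean, arms), where arm = (i, v) denotes the
    atomic intervention do(X_i = v) and arm = None is pure observation.
    """
    rng = random.Random(seed)
    p = [rng.uniform(0.2, 0.8) for _ in range(n_vars)]           # P(X_i = 1)
    w = [rng.uniform(0.0, 1.0 / n_vars) for _ in range(n_vars)]  # reward weights

    def sample(arm):
        x = [1 if rng.random() < p[i] else 0 for i in range(n_vars)]
        if arm is not None:
            i, v = arm
            x[i] = v  # do(X_i = v): override the mechanism generating X_i
        mean_y = sum(wi * xi for wi, xi in zip(w, x))
        return 1 if rng.random() < mean_y else 0

    def true_mean(arm):
        # Closed-form E[Y | do(arm)] for this linear toy model.
        q = list(p)
        if arm is not None:
            i, v = arm
            q[i] = v
        return sum(wi * qi for wi, qi in zip(w, q))

    arms = [None] + [(i, v) for i in range(n_vars) for v in (0, 1)]
    return sample, true_mean, arms


def uniform_explore(sample, arms, T):
    """Baseline: pull arms round-robin for T rounds, return best empirical arm."""
    counts = {a: 0 for a in arms}
    sums = {a: 0.0 for a in arms}
    for t in range(T):
        a = arms[t % len(arms)]
        sums[a] += sample(a)
        counts[a] += 1
    return max(arms, key=lambda a: sums[a] / max(counts[a], 1))


if __name__ == "__main__":
    sample, true_mean, arms = toy_cbn()
    best = max(true_mean(a) for a in arms)
    chosen = uniform_explore(sample, arms, T=5000)
    print("simple regret:", best - true_mean(chosen))
```

The baseline's simple regret scales with the total number of arms; the point of the paper's algorithm is that exploiting the causal graph replaces that dependence with the potentially much smaller instance parameter $m(\mathcal{C})$.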