Active Exploration via Experiment Design in Markov Chains

Mojmir Mutny; Tadeusz Janik; Andreas Krause

Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:7349-7374, 2023.

Abstract

A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest. Classical experimental design optimally allocates the experimental budget into measurements to maximize a notion of utility (e.g., reduction in uncertainty about the unknown quantity). We consider a rich setting, where the experiments are associated with states in a Markov chain, and we can only choose them by selecting a policy controlling the state transitions. This problem captures important applications, from exploration in reinforcement learning to spatial monitoring tasks. We propose an algorithm – markov-design – that efficiently selects policies whose measurement allocation provably converges to the optimal one. The algorithm is sequential in nature, adapting its choice of policies (experiments) using past measurements. In addition to our theoretical analysis, we demonstrate our framework on applications in ecological surveillance and pharmacology.

Cite this Paper

BibTeX


@InProceedings{pmlr-v206-mutny23a,
  title = {Active Exploration via Experiment Design in Markov Chains},
  author = {Mutny, Mojmir and Janik, Tadeusz and Krause, Andreas},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = {7349--7374},
  year = {2023},
  editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = {206},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v206/mutny23a/mutny23a.pdf},
  url = {https://proceedings.mlr.press/v206/mutny23a.html},
  abstract = {A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest. Classical experimental design optimally allocates the experimental budget into measurements to maximize a notion of utility (e.g., reduction in uncertainty about the unknown quantity). We consider a rich setting, where the experiments are associated with states in a Markov chain, and we can only choose them by selecting a policy controlling the state transitions. This problem captures important applications, from exploration in reinforcement learning to spatial monitoring tasks. We propose an algorithm – markov-design – that efficiently selects policies whose measurement allocation provably converges to the optimal one. The algorithm is sequential in nature, adapting its choice of policies (experiments) using past measurements. In addition to our theoretical analysis, we demonstrate our framework on applications in ecological surveillance and pharmacology.}
}

Endnote

%0 Conference Paper
%T Active Exploration via Experiment Design in Markov Chains
%A Mojmir Mutny
%A Tadeusz Janik
%A Andreas Krause
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-mutny23a
%I PMLR
%P 7349--7374
%U https://proceedings.mlr.press/v206/mutny23a.html
%V 206
%X A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest. Classical experimental design optimally allocates the experimental budget into measurements to maximize a notion of utility (e.g., reduction in uncertainty about the unknown quantity). We consider a rich setting, where the experiments are associated with states in a Markov chain, and we can only choose them by selecting a policy controlling the state transitions. This problem captures important applications, from exploration in reinforcement learning to spatial monitoring tasks. We propose an algorithm – markov-design – that efficiently selects policies whose measurement allocation provably converges to the optimal one. The algorithm is sequential in nature, adapting its choice of policies (experiments) using past measurements. In addition to our theoretical analysis, we demonstrate our framework on applications in ecological surveillance and pharmacology.

APA


Mutny, M., Janik, T. & Krause, A. (2023). Active Exploration via Experiment Design in Markov Chains. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:7349-7374. Available from https://proceedings.mlr.press/v206/mutny23a.html.
