Optimizing Sequential Experimental Design with Deep Reinforcement Learning

Tom Blau; Edwin V. Bonilla; Iadine Chades; Amir Dezfouli

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2107-2128, 2022.

Abstract

Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches practical, by training a parameterized policy that proposes designs efficiently at deployment time. However, these methods may not sufficiently explore the design space, require access to a differentiable probabilistic model and can only optimize over continuous design spaces. Here, we address these limitations by showing that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP). We solve the equivalent MDP with modern deep reinforcement learning techniques. Our experiments show that our approach is also computationally efficient at deployment time and exhibits state-of-the-art performance on both continuous and discrete design spaces, even when the probabilistic model is a black box.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-blau22a,
  title     = {Optimizing Sequential Experimental Design with Deep Reinforcement Learning},
  author    = {Blau, Tom and Bonilla, Edwin V. and Chades, Iadine and Dezfouli, Amir},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {2107--2128},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/blau22a/blau22a.pdf},
  url       = {https://proceedings.mlr.press/v162/blau22a.html},
  abstract  = {Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches practical, by training a parameterized policy that proposes designs efficiently at deployment time. However, these methods may not sufficiently explore the design space, require access to a differentiable probabilistic model and can only optimize over continuous design spaces. Here, we address these limitations by showing that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP). We solve the equivalent MDP with modern deep reinforcement learning techniques. Our experiments show that our approach is also computationally efficient at deployment time and exhibits state-of-the-art performance on both continuous and discrete design spaces, even when the probabilistic model is a black box.}
}

Endnote

%0 Conference Paper
%T Optimizing Sequential Experimental Design with Deep Reinforcement Learning
%A Tom Blau
%A Edwin V. Bonilla
%A Iadine Chades
%A Amir Dezfouli
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-blau22a
%I PMLR
%P 2107--2128
%U https://proceedings.mlr.press/v162/blau22a.html
%V 162
%X Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches practical, by training a parameterized policy that proposes designs efficiently at deployment time. However, these methods may not sufficiently explore the design space, require access to a differentiable probabilistic model and can only optimize over continuous design spaces. Here, we address these limitations by showing that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP). We solve the equivalent MDP with modern deep reinforcement learning techniques. Our experiments show that our approach is also computationally efficient at deployment time and exhibits state-of-the-art performance on both continuous and discrete design spaces, even when the probabilistic model is a black box.

APA


Blau, T., Bonilla, E. V., Chades, I., & Dezfouli, A. (2022). Optimizing Sequential Experimental Design with Deep Reinforcement Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:2107-2128. Available from https://proceedings.mlr.press/v162/blau22a.html.
