Minimax-Bayes Reinforcement Learning

Thomas Kleine Buening; Christos Dimitrakakis; Hannes Eriksson; Divya Grover; Emilio Jorge

Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:7511-7527, 2023.

Abstract

While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.

Cite this Paper

BibTeX


@InProceedings{pmlr-v206-buening23a,
  title     = {Minimax-Bayes Reinforcement Learning},
  author    = {Buening, Thomas Kleine and Dimitrakakis, Christos and Eriksson, Hannes and Grover, Divya and Jorge, Emilio},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {7511--7527},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/buening23a/buening23a.pdf},
  url       = {https://proceedings.mlr.press/v206/buening23a.html},
  abstract = 	 {While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.}
}

Endnote

%0 Conference Paper
%T Minimax-Bayes Reinforcement Learning
%A Thomas Kleine Buening
%A Christos Dimitrakakis
%A Hannes Eriksson
%A Divya Grover
%A Emilio Jorge
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-buening23a
%I PMLR
%P 7511--7527
%U https://proceedings.mlr.press/v206/buening23a.html
%V 206
%X While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution. One idea is to employ a worst-case prior. However, this is not as easy to specify in sequential decision making as in simple statistical estimation problems. This paper studies (sometimes approximate) minimax-Bayes solutions for various reinforcement learning problems to gain insights into the properties of the corresponding priors and policies. We find that while the worst-case prior depends on the setting, the corresponding minimax policies are more robust than those that assume a standard (i.e. uniform) prior.

APA


Buening, T.K., Dimitrakakis, C., Eriksson, H., Grover, D. & Jorge, E. (2023). Minimax-Bayes Reinforcement Learning. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:7511-7527. Available from https://proceedings.mlr.press/v206/buening23a.html.
