POPCORN: Partially Observed Prediction Constrained Reinforcement Learning

Joseph Futoma; Michael Hughes; Finale Doshi-Velez

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3578-3588, 2020.

Abstract

Many medical decision-making tasks can be framed as partially observed Markov decision processes (POMDPs). However, prevailing two-stage approaches that first learn a POMDP and then solve it often fail because the model that best fits the data may not be well suited for planning. We introduce a new optimization objective that (a) produces both high-performing policies and high-quality generative models, even when some observations are irrelevant for planning, and (b) does so in batch off-policy settings that are typical in healthcare, when only retrospective data is available. We demonstrate our approach on synthetic examples and a challenging medical decision-making problem.

Cite this Paper

BibTeX

@InProceedings{pmlr-v108-futoma20a,
  title     = {POPCORN: Partially Observed Prediction Constrained Reinforcement Learning},
  author    = {Futoma, Joseph and Hughes, Michael and Doshi-Velez, Finale},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {3578--3588},
  year      = {2020},
  editor    = {Chiappa, Silvia and Calandra, Roberto},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/futoma20a/futoma20a.pdf},
  url       = {https://proceedings.mlr.press/v108/futoma20a.html},
  abstract  = {Many medical decision-making tasks can be framed as partially observed Markov decision processes (POMDPs). However, prevailing two-stage approaches that first learn a POMDP and then solve it often fail because the model that best fits the data may not be well suited for planning. We introduce a new optimization objective that (a) produces both high-performing policies and high-quality generative models, even when some observations are irrelevant for planning, and (b) does so in batch off-policy settings that are typical in healthcare, when only retrospective data is available. We demonstrate our approach on synthetic examples and a challenging medical decision-making problem.}
}

Endnote

%0 Conference Paper
%T POPCORN: Partially Observed Prediction Constrained Reinforcement Learning
%A Joseph Futoma
%A Michael Hughes
%A Finale Doshi-Velez
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-futoma20a
%I PMLR
%P 3578--3588
%U https://proceedings.mlr.press/v108/futoma20a.html
%V 108
%X Many medical decision-making tasks can be framed as partially observed Markov decision processes (POMDPs). However, prevailing two-stage approaches that first learn a POMDP and then solve it often fail because the model that best fits the data may not be well suited for planning. We introduce a new optimization objective that (a) produces both high-performing policies and high-quality generative models, even when some observations are irrelevant for planning, and (b) does so in batch off-policy settings that are typical in healthcare, when only retrospective data is available. We demonstrate our approach on synthetic examples and a challenging medical decision-making problem.

APA

Futoma, J., Hughes, M., & Doshi-Velez, F. (2020). POPCORN: Partially Observed Prediction Constrained Reinforcement Learning. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3578-3588. Available from https://proceedings.mlr.press/v108/futoma20a.html.
