Faster saddle-point optimization for solving large-scale Markov decision processes
Proceedings of the 2nd Conference on Learning for Dynamics and Control, PMLR 120:413-423, 2020.
Abstract
We consider the problem of computing optimal policies in average-reward Markov decision processes. This classical problem can be formulated as a linear program directly amenable to saddle-point optimization methods, albeit with a number of variables that is linear in the number of states. To address this scalability issue, recent work has considered a linearly relaxed version of the resulting saddle-point problem. Our work aims to achieve a better understanding of this relaxed optimization problem by characterizing the conditions necessary for convergence to the optimal policy, and by designing an optimization algorithm enjoying fast convergence rates that are independent of the size of the state space. Notably, our characterization points out some potential issues with previous work.
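For context, the following is a minimal LaTeX sketch of the standard linear-programming and saddle-point formulation alluded to in the abstract, valid under standard (e.g., unichain) assumptions. The notation ($\mathcal{S}$, $\mathcal{A}$, $r$, $P$, $\mu$, $v$) is generic and assumed here, not taken from the paper itself.

% Standard dual LP for average-reward MDPs: maximize expected reward over
% state--action occupancy measures mu subject to stationarity (flow) constraints.
\begin{align*}
  \max_{\mu \ge 0}\ & \sum_{s,a} \mu(s,a)\, r(s,a) \\
  \text{s.t.}\ & \sum_{a'} \mu(s',a') = \sum_{s,a} P(s' \mid s,a)\, \mu(s,a) \quad \forall s' \in \mathcal{S},
  \qquad \sum_{s,a} \mu(s,a) = 1.
\end{align*}
% Dualizing the flow constraints with multipliers v(s) (value-like variables)
% yields a bilinear saddle-point problem over the state--action simplex:
\begin{equation*}
  \max_{\mu \in \Delta_{\mathcal{S}\times\mathcal{A}}}\ \min_{v \in \mathbb{R}^{\mathcal{S}}}\
  \sum_{s,a} \mu(s,a)\Bigl( r(s,a) + \sum_{s'} P(s' \mid s,a)\, v(s') - v(s) \Bigr).
\end{equation*}
% Its value is the optimal average reward, and v has one coordinate per state,
% which is the "number of variables linear in the number of states" noted above.

One common reading of the "linearly relaxed" version mentioned in the abstract, based on related work rather than on this abstract itself, is that the variables are restricted to low-dimensional linear parameterizations (e.g., $v = \Phi\theta$ for a feature matrix $\Phi$), so that the effective number of optimization variables no longer scales with the size of the state space.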