Faster saddle-point optimization for solving large-scale Markov decision processes

Joan Bas Serrano, Gergely Neu
Proceedings of the 2nd Conference on Learning for Dynamics and Control, PMLR 120:413-423, 2020.

Abstract

We consider the problem of computing optimal policies in average-reward Markov decision processes. This classical problem can be formulated as a linear program directly amenable to saddle-point optimization methods, albeit with a number of variables that is linear in the number of states. To address this issue, recent work has considered a linearly relaxed version of the resulting saddle-point problem. Our work aims at achieving a better understanding of this relaxed optimization problem by characterizing the conditions necessary for convergence to the optimal policy, and designing an optimization algorithm enjoying fast convergence rates that are independent of the size of the state space. Notably, our characterization points out some potential issues with previous work.

Cite this Paper


BibTeX
@InProceedings{pmlr-v120-serrano20a, title = {Faster saddle-point optimization for solving large-scale Markov decision processes}, author = {Serrano, Joan Bas and Neu, Gergely}, booktitle = {Proceedings of the 2nd Conference on Learning for Dynamics and Control}, pages = {413--423}, year = {2020}, editor = {Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie}, volume = {120}, series = {Proceedings of Machine Learning Research}, month = {10--11 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v120/serrano20a/serrano20a.pdf}, url = {https://proceedings.mlr.press/v120/serrano20a.html}, abstract = {We consider the problem of computing optimal policies in average-reward Markov decision processes. This classical problem can be formulated as a linear program directly amenable to saddle-point optimization methods, albeit with a number of variables that is linear in the number of states. To address this issue, recent work has considered a linearly relaxed version of the resulting saddle-point problem. Our work aims at achieving a better understanding of this relaxed optimization problem by characterizing the conditions necessary for convergence to the optimal policy, and designing an optimization algorithm enjoying fast convergence rates that are independent of the size of the state space. Notably, our characterization points out some potential issues with previous work.} }
Endnote
%0 Conference Paper %T Faster saddle-point optimization for solving large-scale Markov decision processes %A Joan Bas Serrano %A Gergely Neu %B Proceedings of the 2nd Conference on Learning for Dynamics and Control %C Proceedings of Machine Learning Research %D 2020 %E Alexandre M. Bayen %E Ali Jadbabaie %E George Pappas %E Pablo A. Parrilo %E Benjamin Recht %E Claire Tomlin %E Melanie Zeilinger %F pmlr-v120-serrano20a %I PMLR %P 413--423 %U https://proceedings.mlr.press/v120/serrano20a.html %V 120 %X We consider the problem of computing optimal policies in average-reward Markov decision processes. This classical problem can be formulated as a linear program directly amenable to saddle-point optimization methods, albeit with a number of variables that is linear in the number of states. To address this issue, recent work has considered a linearly relaxed version of the resulting saddle-point problem. Our work aims at achieving a better understanding of this relaxed optimization problem by characterizing the conditions necessary for convergence to the optimal policy, and designing an optimization algorithm enjoying fast convergence rates that are independent of the size of the state space. Notably, our characterization points out some potential issues with previous work.
APA
Serrano, J.B. & Neu, G.. (2020). Faster saddle-point optimization for solving large-scale Markov decision processes. Proceedings of the 2nd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 120:413-423 Available from https://proceedings.mlr.press/v120/serrano20a.html.

Related Material