Taylor Expansion Policy Optimization

Yunhao Tang, Michal Valko, Remi Munos
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:9397-9406, 2020.

Abstract

In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor Expansion Policy Optimization, a policy optimization formalism that generalizes prior work as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-tang20d,
  title = {Taylor Expansion Policy Optimization},
  author = {Tang, Yunhao and Valko, Michal and Munos, Remi},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {9397--9406},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/tang20d/tang20d.pdf},
  url = {https://proceedings.mlr.press/v119/tang20d.html},
  abstract = {In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor Expansion Policy Optimization, a policy optimization formalism that generalizes prior work as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.}
}
Endnote
%0 Conference Paper
%T Taylor Expansion Policy Optimization
%A Yunhao Tang
%A Michal Valko
%A Remi Munos
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-tang20d
%I PMLR
%P 9397--9406
%U https://proceedings.mlr.press/v119/tang20d.html
%V 119
%X In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor Expansion Policy Optimization, a policy optimization formalism that generalizes prior work as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.
APA
Tang, Y., Valko, M., & Munos, R. (2020). Taylor Expansion Policy Optimization. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:9397-9406. Available from https://proceedings.mlr.press/v119/tang20d.html
