Dual Temporal Difference Learning

Min Yang, Yuxi Li, Dale Schuurmans
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:631-638, 2009.

Abstract

Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of temporal difference learning using dual representations. We contribute significant progress by proving the convergence of dual temporal difference learning with eligibility traces. Experimental results suggest that the dual algorithms seem to demonstrate empirical benefits over standard primal algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v5-yang09a,
  title = {Dual Temporal Difference Learning},
  author = {Yang, Min and Li, Yuxi and Schuurmans, Dale},
  booktitle = {Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics},
  pages = {631--638},
  year = {2009},
  editor = {van Dyk, David and Welling, Max},
  volume = {5},
  series = {Proceedings of Machine Learning Research},
  address = {Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA},
  month = {16--18 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v5/yang09a/yang09a.pdf},
  url = {https://proceedings.mlr.press/v5/yang09a.html},
  abstract = {Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of temporal difference learning using dual representations. We contribute significant progress by proving the convergence of dual temporal difference learning with eligibility traces. Experimental results suggest that the dual algorithms seem to demonstrate empirical benefits over standard primal algorithms.}
}
Endnote
%0 Conference Paper
%T Dual Temporal Difference Learning
%A Min Yang
%A Yuxi Li
%A Dale Schuurmans
%B Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2009
%E David van Dyk
%E Max Welling
%F pmlr-v5-yang09a
%I PMLR
%P 631--638
%U https://proceedings.mlr.press/v5/yang09a.html
%V 5
%X Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of temporal difference learning using dual representations. We contribute significant progress by proving the convergence of dual temporal difference learning with eligibility traces. Experimental results suggest that the dual algorithms seem to demonstrate empirical benefits over standard primal algorithms.
RIS
TY - CPAPER
TI - Dual Temporal Difference Learning
AU - Min Yang
AU - Yuxi Li
AU - Dale Schuurmans
BT - Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
DA - 2009/04/15
ED - David van Dyk
ED - Max Welling
ID - pmlr-v5-yang09a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 5
SP - 631
EP - 638
L1 - http://proceedings.mlr.press/v5/yang09a/yang09a.pdf
UR - https://proceedings.mlr.press/v5/yang09a.html
AB - Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of temporal difference learning using dual representations. We contribute significant progress by proving the convergence of dual temporal difference learning with eligibility traces. Experimental results suggest that the dual algorithms seem to demonstrate empirical benefits over standard primal algorithms.
ER -
APA
Yang, M., Li, Y. & Schuurmans, D. (2009). Dual Temporal Difference Learning. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 5:631-638. Available from https://proceedings.mlr.press/v5/yang09a.html.
