Finite-time theory for momentum Q-learning

Weng Bowen; Xiong Huaqing; Zhao Lin; Liang Yingbin; Zhang Wei

Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:665-674, 2021.

Abstract

Existing studies indicate that momentum ideas in conventional optimization can be used to improve the performance of Q-learning algorithms. However, the finite-time analysis for momentum-based Q-learning algorithms is only available for the tabular case without function approximation. This paper analyzes a class of momentum-based Q-learning algorithms with finite-time convergence guarantees. Specifically, we propose the MomentumQ algorithm, which integrates Nesterov’s and Polyak’s momentum schemes and generalizes existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish a convergence guarantee for MomentumQ with linear function approximation under Markovian sampling. In particular, we characterize a finite-time convergence rate that is provably faster than that of vanilla Q-learning. This is the first finite-time analysis for momentum-based Q-learning algorithms with function approximation. For the tabular case under synchronous sampling, we also obtain a finite-time convergence rate that is slightly better than that of SpeedyQ (Azar et al., NIPS 2011). Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.
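
For readers unfamiliar with the terms used in the abstract, the standard textbook forms of the vanilla Q-learning update and of Polyak's and Nesterov's momentum schemes are sketched below. This is background only and is not the paper's MomentumQ update, which the abstract does not state. The symbols are assumptions introduced here for illustration: step size \alpha (or \alpha_t), momentum weight \beta, discount factor \gamma, and objective f.

```latex
% Background sketch only: standard forms, NOT the MomentumQ update from the paper.
% All symbols (\alpha, \beta, \gamma, f) are illustrative assumptions.

% Vanilla Q-learning on a sampled transition (s, a, r, s'):
\[
Q_{t+1}(s,a) = Q_t(s,a)
  + \alpha_t \Bigl( r + \gamma \max_{a'} Q_t(s',a') - Q_t(s,a) \Bigr)
\]

% Polyak (heavy-ball) momentum for minimizing f:
\[
x_{t+1} = x_t - \alpha \nabla f(x_t) + \beta \, (x_t - x_{t-1})
\]

% Nesterov momentum: extrapolate first, then take the gradient step
% at the extrapolated point:
\[
y_t = x_t + \beta \, (x_t - x_{t-1}), \qquad
x_{t+1} = y_t - \alpha \nabla f(y_t)
\]
```

The two momentum schemes differ only in where the gradient is evaluated: at the current iterate for Polyak's scheme, and at the extrapolated point for Nesterov's.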

Cite this Paper

BibTeX


@InProceedings{pmlr-v161-bowen21a,
  title = 	 {Finite-time theory for momentum Q-learning},
  author =       {Weng, Bowen and Xiong, Huaqing and Zhao, Lin and Liang, Yingbin and Zhang, Wei},
  booktitle = 	 {Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {665--674},
  year = 	 {2021},
  editor = 	 {de Campos, Cassio and Maathuis, Marloes H.},
  volume = 	 {161},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {27--30 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v161/bowen21a/bowen21a.pdf},
  url = 	 {https://proceedings.mlr.press/v161/bowen21a.html},
  abstract = 	 {Existing studies indicate that momentum ideas in conventional optimization can be used to improve the performance of Q-learning algorithms. However, the finite-time analysis for momentum-based Q-learning algorithms is only available for the tabular case without function approximation. This paper analyzes a class of momentum-based Q-learning algorithms with finite-time convergence guarantee. Specifically, we propose the MomentumQ algorithm, which integrates the Nesterov’s and Polyak’s momentum schemes, and generalizes the existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximation under Markovian sampling. In particular, we characterize a finite-time convergence rate which is provably faster than the vanilla Q-learning. This is the first finite-time analysis for momentum-based Q-learning algorithms with function approximation. For the tabular case under synchronous sampling, we also obtain a finite-time convergence rate that is slightly better than the SpeedyQ (Azar et al., NIPS 2011). Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.}
}
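
As a minimal sketch of how the entry above might be used, the following LaTeX document cites it by its key, pmlr-v161-bowen21a. The file name references.bib and the plain bibliography style are assumptions chosen for illustration; any .bib file containing the entry and any installed style would work the same way.

```latex
\documentclass{article}

\begin{document}

% The cite key matches the key of the BibTeX entry.
MomentumQ integrates Nesterov's and Polyak's momentum
schemes~\cite{pmlr-v161-bowen21a}.

\bibliographystyle{plain}   % assumed style; substitute your venue's style
\bibliography{references}   % assumes the entry is saved in references.bib (hypothetical name)

\end{document}
```

Compiling typically takes a latex run, a bibtex run, and two more latex runs so that the citation and the reference list resolve.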

Endnote

%0 Conference Paper
%T Finite-time theory for momentum Q-learning
%A Weng, Bowen
%A Xiong, Huaqing
%A Zhao, Lin
%A Liang, Yingbin
%A Zhang, Wei
%B Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2021
%E Cassio de Campos
%E Marloes H. Maathuis
%F pmlr-v161-bowen21a
%I PMLR
%P 665--674
%U https://proceedings.mlr.press/v161/bowen21a.html
%V 161
%X Existing studies indicate that momentum ideas in conventional optimization can be used to improve the performance of Q-learning algorithms. However, the finite-time analysis for momentum-based Q-learning algorithms is only available for the tabular case without function approximation. This paper analyzes a class of momentum-based Q-learning algorithms with finite-time convergence guarantee. Specifically, we propose the MomentumQ algorithm, which integrates the Nesterov’s and Polyak’s momentum schemes, and generalizes the existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximation under Markovian sampling. In particular, we characterize a finite-time convergence rate which is provably faster than the vanilla Q-learning. This is the first finite-time analysis for momentum-based Q-learning algorithms with function approximation. For the tabular case under synchronous sampling, we also obtain a finite-time convergence rate that is slightly better than the SpeedyQ (Azar et al., NIPS 2011). Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.

APA


Weng, B., Xiong, H., Zhao, L., Liang, Y. & Zhang, W. (2021). Finite-time theory for momentum Q-learning. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 161:665-674. Available from https://proceedings.mlr.press/v161/bowen21a.html.
