Growing Q-networks: Solving continuous control tasks with adaptive control resolution

Tim Seyde, Peter Werner, Wilko Schwarting, Markus Wulfmeier, Daniela Rus
Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:1646-1661, 2024.

Abstract

Recent reinforcement learning approaches have shown that bang-bang policies can be surprisingly effective at solving continuous control benchmarks. The underlying coarse action-space discretizations often yield favorable exploration characteristics, and final performance does not visibly suffer in the absence of action penalization, in line with optimal control theory. In robotics applications, smooth control signals are commonly preferred to reduce system wear and improve energy efficiency, yet regularization via action costs can be detrimental to exploration. Our work aims to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution. We take advantage of recent results in decoupled Q-learning to scale our approach to high-dimensional action spaces up to dim(A) = 38. Our work indicates that adaptive control resolution combined with value decomposition yields simple critic-only algorithms that achieve surprisingly strong performance on continuous control tasks.
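
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a decoupled critic over per-dimension action bins whose resolution is refined from coarse (bang-bang) to fine. The class and function names, the state-independent tabular critic, and the 2 -> 3 -> 5 -> 9 bin schedule are illustrative assumptions for exposition, not the authors' implementation.

# Hypothetical sketch: decoupled per-dimension Q-values with a growing
# action discretization; schedule and names are assumptions, not the paper's code.
import numpy as np

def action_grid(n_bins):
    # Evenly spaced bins on [-1, 1]; n_bins = 2 recovers bang-bang control.
    return np.linspace(-1.0, 1.0, n_bins)

class DecoupledTabularCritic:
    # Toy decoupled critic: one independent Q-vector per action dimension.
    # A real implementation would condition on the state with a neural network;
    # a state-independent table keeps this example self-contained.
    def __init__(self, action_dim, n_bins):
        self.bins = action_grid(n_bins)
        self.q = np.zeros((action_dim, n_bins))

    def greedy_action(self):
        # Decoupled selection: argmax independently per dimension, so the cost
        # of maximization grows linearly (not exponentially) with dim(A).
        idx = self.q.argmax(axis=1)
        return self.bins[idx]

    def grow(self, new_n_bins):
        # Refine the grid and warm-start the new bins by interpolating the
        # existing Q-values, so earlier coarse estimates are reused.
        new_bins = action_grid(new_n_bins)
        new_q = np.stack([np.interp(new_bins, self.bins, q_d) for q_d in self.q])
        self.bins, self.q = new_bins, new_q

critic = DecoupledTabularCritic(action_dim=3, n_bins=2)   # start bang-bang
critic.q = np.random.randn(3, 2)                          # stand-in for learned values
print("coarse action:", critic.greedy_action())
for n in (3, 5, 9):                                        # assumed growth schedule
    critic.grow(n)
critic.q += 0.1 * np.random.randn(*critic.q.shape)         # stand-in for fine-resolution learning
print("refined action:", critic.greedy_action())

The key design point illustrated here is that refining the discretization does not discard earlier learning: coarse Q-estimates seed the finer grid, while per-dimension decoupling keeps action selection tractable in high-dimensional action spaces.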

Cite this Paper


BibTeX
@InProceedings{pmlr-v242-seyde24a,
  title     = {Growing {Q}-networks: {S}olving continuous control tasks with adaptive control resolution},
  author    = {Seyde, Tim and Werner, Peter and Schwarting, Wilko and Wulfmeier, Markus and Rus, Daniela},
  booktitle = {Proceedings of the 6th Annual Learning for Dynamics \& Control Conference},
  pages     = {1646--1661},
  year      = {2024},
  editor    = {Abate, Alessandro and Cannon, Mark and Margellos, Kostas and Papachristodoulou, Antonis},
  volume    = {242},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--17 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v242/seyde24a/seyde24a.pdf},
  url       = {https://proceedings.mlr.press/v242/seyde24a.html}
}
Endnote
%0 Conference Paper
%T Growing Q-networks: Solving continuous control tasks with adaptive control resolution
%A Tim Seyde
%A Peter Werner
%A Wilko Schwarting
%A Markus Wulfmeier
%A Daniela Rus
%B Proceedings of the 6th Annual Learning for Dynamics & Control Conference
%C Proceedings of Machine Learning Research
%D 2024
%E Alessandro Abate
%E Mark Cannon
%E Kostas Margellos
%E Antonis Papachristodoulou
%F pmlr-v242-seyde24a
%I PMLR
%P 1646--1661
%U https://proceedings.mlr.press/v242/seyde24a.html
%V 242
APA
Seyde, T., Werner, P., Schwarting, W., Wulfmeier, M., & Rus, D. (2024). Growing Q-networks: Solving continuous control tasks with adaptive control resolution. Proceedings of the 6th Annual Learning for Dynamics & Control Conference, in Proceedings of Machine Learning Research 242:1646-1661. Available from https://proceedings.mlr.press/v242/seyde24a.html.
