On the Convergence of Continuous Single-timescale Actor-critic

Xuyang Chen, Lin Zhao
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:9829-9859, 2025.

Abstract

Actor-critic algorithms have been instrumental in boosting the performance of numerous challenging applications involving continuous control, such as highly robust and agile robot motion control. However, their theoretical understanding remains largely underdeveloped. Existing analyses mostly focus on finite state-action spaces and on simplified variants of actor-critic, such as double-loop updates with i.i.d. sampling, which are often impractical for real-world applications. We consider the canonical and widely adopted single-timescale updates with Markovian sampling in continuous state-action space. Specifically, we establish finite-time convergence by introducing a novel Lyapunov analysis framework, which provides a unified convergence characterization of both the actor and the critic. Our approach is less conservative than previous methods and offers new insights into the coupled dynamics of actor-critic updates.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-chen25co, title = {On the Convergence of Continuous Single-timescale Actor-critic}, author = {Chen, Xuyang and Zhao, Lin}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {9829--9859}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/chen25co/chen25co.pdf}, url = {https://proceedings.mlr.press/v267/chen25co.html}, abstract = {Actor-critic algorithms have been instrumental in boosting the performance of numerous challenging applications involving continuous control, such as highly robust and agile robot motion control. However, their theoretical understanding remains largely underdeveloped. Existing analyses mostly focus on finite state-action spaces and on simplified variants of actor-critic, such as double-loop updates with i.i.d. sampling, which are often impractical for real-world applications. We consider the canonical and widely adopted single-timescale updates with Markovian sampling in continuous state-action space. Specifically, we establish finite-time convergence by introducing a novel Lyapunov analysis framework, which provides a unified convergence characterization of both the actor and the critic. Our approach is less conservative than previous methods and offers new insights into the coupled dynamics of actor-critic updates.} }
Endnote
%0 Conference Paper %T On the Convergence of Continuous Single-timescale Actor-critic %A Xuyang Chen %A Lin Zhao %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-chen25co %I PMLR %P 9829--9859 %U https://proceedings.mlr.press/v267/chen25co.html %V 267 %X Actor-critic algorithms have been instrumental in boosting the performance of numerous challenging applications involving continuous control, such as highly robust and agile robot motion control. However, their theoretical understanding remains largely underdeveloped. Existing analyses mostly focus on finite state-action spaces and on simplified variants of actor-critic, such as double-loop updates with i.i.d. sampling, which are often impractical for real-world applications. We consider the canonical and widely adopted single-timescale updates with Markovian sampling in continuous state-action space. Specifically, we establish finite-time convergence by introducing a novel Lyapunov analysis framework, which provides a unified convergence characterization of both the actor and the critic. Our approach is less conservative than previous methods and offers new insights into the coupled dynamics of actor-critic updates.
APA
Chen, X. & Zhao, L.. (2025). On the Convergence of Continuous Single-timescale Actor-critic. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:9829-9859 Available from https://proceedings.mlr.press/v267/chen25co.html.

Related Material