Offline Reinforcement Learning at Multiple Frequencies

Kaylee Burns, Tianhe Yu, Chelsea Finn, Karol Hausman
Proceedings of The 6th Conference on Robot Learning, PMLR 205:2041-2051, 2023.

Abstract

To leverage many sources of offline robot data, robots must grapple with the heterogeneity of such data. In this paper, we focus on one particular aspect of this challenge: learning from offline data collected at different control frequencies. Across labs, the discretization of controllers, sampling rates of sensors, and demands of a task of interest may differ, giving rise to a mixture of frequencies in an aggregated dataset. We study how well offline reinforcement learning (RL) algorithms can accommodate data with a mixture of frequencies during training. We observe that the $Q$-value propagates at different rates for different discretizations, leading to a number of learning challenges for off-the-shelf offline RL algorithms. We present a simple yet effective solution that enforces consistency in the rate of $Q$-value updates to stabilize learning. By scaling the value of $N$ in $N$-step returns with the discretization size, we effectively balance $Q$-value propagation, leading to more stable convergence. On three simulated robotic control problems, we empirically find that this simple approach significantly outperforms naïve mixing both in terms of absolute performance and training stability, while also improving over using only the data from a single control frequency.
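The core idea in the abstract, scaling $N$ in $N$-step returns with the control frequency so that each backup spans a comparable wall-clock horizon, can be illustrated with a short sketch. The snippet below is a minimal illustration under assumed names and defaults: the function scaled_n_step_target, the reference timestep base_dt, and the fixed per-transition discount are illustrative choices, not the authors' implementation.

import numpy as np

def scaled_n_step_target(rewards, bootstrap_value, dt,
                         base_dt=0.05, base_n=4, gamma=0.99):
    """N-step TD target with N scaled by the control timestep dt.

    Chooses N so that N * dt is roughly constant across frequencies,
    i.e., every backup covers about the same amount of wall-clock time.
    All names and defaults here are illustrative assumptions.
    """
    # Higher-frequency data (smaller dt) gets a larger N, so Q-values
    # propagate at a comparable wall-clock rate across discretizations.
    n = max(1, int(round(base_n * base_dt / dt)))
    n = min(n, len(rewards))

    # Standard N-step return: discounted sum of the first n rewards,
    # bootstrapped with the (target) Q-value estimate after n steps.
    target = 0.0
    for k in range(n):
        target += (gamma ** k) * rewards[k]
    return target + (gamma ** n) * bootstrap_value

# Transitions from 20 Hz (dt = 0.05 s) and 5 Hz (dt = 0.20 s) data:
# both backups span roughly 0.2 s of wall-clock time (n = 4 vs. n = 1).
print(scaled_n_step_target(np.ones(40), bootstrap_value=0.0, dt=0.05))
print(scaled_n_step_target(np.ones(10), bootstrap_value=0.0, dt=0.20))

One could additionally adjust the per-transition discount (e.g., gamma raised to the power dt over a reference timestep) so that returns are discounted consistently in wall-clock time; the sketch keeps a fixed gamma per transition for simplicity.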

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-burns23a,
  title     = {Offline Reinforcement Learning at Multiple Frequencies},
  author    = {Burns, Kaylee and Yu, Tianhe and Finn, Chelsea and Hausman, Karol},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {2041--2051},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/burns23a/burns23a.pdf},
  url       = {https://proceedings.mlr.press/v205/burns23a.html},
  abstract  = {To leverage many sources of offline robot data, robots must grapple with the heterogeneity of such data. In this paper, we focus on one particular aspect of this challenge: learning from offline data collected at different control frequencies. Across labs, the discretization of controllers, sampling rates of sensors, and demands of a task of interest may differ, giving rise to a mixture of frequencies in an aggregated dataset. We study how well offline reinforcement learning (RL) algorithms can accommodate data with a mixture of frequencies during training. We observe that the $Q$-value propagates at different rates for different discretizations, leading to a number of learning challenges for off-the-shelf offline RL algorithms. We present a simple yet effective solution that enforces consistency in the rate of $Q$-value updates to stabilize learning. By scaling the value of $N$ in $N$-step returns with the discretization size, we effectively balance $Q$-value propagation, leading to more stable convergence. On three simulated robotic control problems, we empirically find that this simple approach significantly outperforms naïve mixing both in terms of absolute performance and training stability, while also improving over using only the data from a single control frequency.}
}
Endnote
%0 Conference Paper
%T Offline Reinforcement Learning at Multiple Frequencies
%A Kaylee Burns
%A Tianhe Yu
%A Chelsea Finn
%A Karol Hausman
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-burns23a
%I PMLR
%P 2041--2051
%U https://proceedings.mlr.press/v205/burns23a.html
%V 205
%X To leverage many sources of offline robot data, robots must grapple with the heterogeneity of such data. In this paper, we focus on one particular aspect of this challenge: learning from offline data collected at different control frequencies. Across labs, the discretization of controllers, sampling rates of sensors, and demands of a task of interest may differ, giving rise to a mixture of frequencies in an aggregated dataset. We study how well offline reinforcement learning (RL) algorithms can accommodate data with a mixture of frequencies during training. We observe that the $Q$-value propagates at different rates for different discretizations, leading to a number of learning challenges for off-the-shelf offline RL algorithms. We present a simple yet effective solution that enforces consistency in the rate of $Q$-value updates to stabilize learning. By scaling the value of $N$ in $N$-step returns with the discretization size, we effectively balance $Q$-value propagation, leading to more stable convergence. On three simulated robotic control problems, we empirically find that this simple approach significantly outperforms naïve mixing both in terms of absolute performance and training stability, while also improving over using only the data from a single control frequency.
APA
Burns, K., Yu, T., Finn, C. & Hausman, K. (2023). Offline Reinforcement Learning at Multiple Frequencies. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:2041-2051. Available from https://proceedings.mlr.press/v205/burns23a.html.