ARS: Adaptive Reward Scaling for Multi-Task Reinforcement Learning

Myungsik Cho, Jongeui Park, Jeonghye Kim, Youngchul Sung
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:10583-10601, 2025.

Abstract

Multi-task reinforcement learning (RL) encounters significant challenges due to varying task complexities and their reward distributions from the environment. To address these issues, in this paper, we propose Adaptive Reward Scaling (ARS), a novel framework that dynamically adjusts reward magnitudes and leverages a periodic network reset mechanism. ARS introduces a history-based reward scaling strategy that ensures balanced reward distributions across tasks, enabling stable and efficient training. The reset mechanism complements this approach by mitigating overfitting and ensuring robust convergence. Empirical evaluations on the Meta-World benchmark demonstrate that ARS significantly outperforms baseline methods, achieving superior performance on challenging tasks while maintaining overall learning efficiency. These results validate ARS’s effectiveness in tackling diverse multi-task RL problems, paving the way for scalable solutions in complex real-world applications.
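
The abstract describes the two components of ARS only at a high level. As a rough illustrative sketch (not the paper's actual formulation), the snippet below shows one way a history-based per-task reward scaler and a periodic reset schedule could fit together; the window size, the scaling statistic (mean absolute reward), and the reset interval are all assumptions made here for illustration.

```python
import numpy as np
from collections import deque

class HistoryRewardScaler:
    """Illustrative sketch of history-based per-task reward scaling.
    NOTE: the window size and the mean-absolute-reward statistic are
    assumptions for illustration, not the formulation from the paper."""

    def __init__(self, num_tasks, window=10_000, eps=1e-8):
        # One rolling history of recent raw rewards per task.
        self.histories = [deque(maxlen=window) for _ in range(num_tasks)]
        self.eps = eps

    def scale(self, task_id, reward):
        # Record the raw reward, then divide it by a statistic of the
        # task's recent reward history so that reward magnitudes become
        # comparable across tasks.
        hist = self.histories[task_id]
        hist.append(reward)
        scale = np.abs(np.asarray(hist)).mean() + self.eps
        return reward / scale

def should_reset(global_step, reset_interval=200_000):
    # Periodic network reset: the caller re-initializes (part of) the
    # actor/critic parameters every `reset_interval` environment steps.
    # The interval is a placeholder value.
    return global_step > 0 and global_step % reset_interval == 0
```

In a training loop, each transition's reward would pass through scale() before being stored in the replay buffer, and should_reset() would gate a re-initialization of the learner's networks; how much of the network is reset and how the scaling statistic is defined are details specified in the paper itself.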

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-cho25d,
  title     = {{ARS}: Adaptive Reward Scaling for Multi-Task Reinforcement Learning},
  author    = {Cho, Myungsik and Park, Jongeui and Kim, Jeonghye and Sung, Youngchul},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {10583--10601},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/cho25d/cho25d.pdf},
  url       = {https://proceedings.mlr.press/v267/cho25d.html},
  abstract  = {Multi-task reinforcement learning (RL) encounters significant challenges due to varying task complexities and their reward distributions from the environment. To address these issues, in this paper, we propose Adaptive Reward Scaling (ARS), a novel framework that dynamically adjusts reward magnitudes and leverages a periodic network reset mechanism. ARS introduces a history-based reward scaling strategy that ensures balanced reward distributions across tasks, enabling stable and efficient training. The reset mechanism complements this approach by mitigating overfitting and ensuring robust convergence. Empirical evaluations on the Meta-World benchmark demonstrate that ARS significantly outperforms baseline methods, achieving superior performance on challenging tasks while maintaining overall learning efficiency. These results validate ARS’s effectiveness in tackling diverse multi-task RL problems, paving the way for scalable solutions in complex real-world applications.}
}
Endnote
%0 Conference Paper
%T ARS: Adaptive Reward Scaling for Multi-Task Reinforcement Learning
%A Myungsik Cho
%A Jongeui Park
%A Jeonghye Kim
%A Youngchul Sung
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-cho25d
%I PMLR
%P 10583--10601
%U https://proceedings.mlr.press/v267/cho25d.html
%V 267
%X Multi-task reinforcement learning (RL) encounters significant challenges due to varying task complexities and their reward distributions from the environment. To address these issues, in this paper, we propose Adaptive Reward Scaling (ARS), a novel framework that dynamically adjusts reward magnitudes and leverages a periodic network reset mechanism. ARS introduces a history-based reward scaling strategy that ensures balanced reward distributions across tasks, enabling stable and efficient training. The reset mechanism complements this approach by mitigating overfitting and ensuring robust convergence. Empirical evaluations on the Meta-World benchmark demonstrate that ARS significantly outperforms baseline methods, achieving superior performance on challenging tasks while maintaining overall learning efficiency. These results validate ARS’s effectiveness in tackling diverse multi-task RL problems, paving the way for scalable solutions in complex real-world applications.
APA
Cho, M., Park, J., Kim, J. & Sung, Y. (2025). ARS: Adaptive Reward Scaling for Multi-Task Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:10583-10601. Available from https://proceedings.mlr.press/v267/cho25d.html.
