Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning

Yiran Wang, Chenshu Liu, Yunfan Li, Sanae Amani, Bolei Zhou, Lin Yang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:63707-63733, 2025.

Abstract

The exploration-exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods have achieved great success in tackling hard-exploration problems. However, they require extensive hyperparameter tuning across different environments, which heavily limits the applicability and accessibility of this line of methods. In this paper, we characterize this problem through an analysis of agent behavior, concluding that choosing a proper hyperparameter is fundamentally difficult. We then identify the difficulty and instability of the optimization when the agent learns with curiosity. We propose our method, hyperparameter robust exploration (Hyper), which substantially mitigates the problem by effectively regularizing the visitation of the exploration policy and decoupling the exploitation to ensure stable training. We theoretically justify that Hyper is provably efficient under the function approximation setting and empirically demonstrate its appealing performance and robustness in various environments.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wang25bo,
  title     = {Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning},
  author    = {Wang, Yiran and Liu, Chenshu and Li, Yunfan and Amani, Sanae and Zhou, Bolei and Yang, Lin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {63707--63733},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wang25bo/wang25bo.pdf},
  url       = {https://proceedings.mlr.press/v267/wang25bo.html},
  abstract  = {The exploration & exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods achieved great success in tackling hard-exploration problems. However, they necessitate extensive hyperparameter tuning on different environments, which heavily limits the applicability and accessibility of this line of methods. In this paper, we characterize this problem via analysis of the agent behavior, concluding the fundamental difficulty of choosing a proper hyperparameter. We then identify the difficulty and the instability of the optimization when the agent learns with curiosity. We propose our method, hyperparameter robust exploration (Hyper), which extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training. We theoretically justify that Hyper is provably efficient under function approximation setting and empirically demonstrate its appealing performance and robustness in various environments.}
}
Endnote
%0 Conference Paper
%T Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning
%A Yiran Wang
%A Chenshu Liu
%A Yunfan Li
%A Sanae Amani
%A Bolei Zhou
%A Lin Yang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wang25bo
%I PMLR
%P 63707--63733
%U https://proceedings.mlr.press/v267/wang25bo.html
%V 267
%X The exploration & exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods achieved great success in tackling hard-exploration problems. However, they necessitate extensive hyperparameter tuning on different environments, which heavily limits the applicability and accessibility of this line of methods. In this paper, we characterize this problem via analysis of the agent behavior, concluding the fundamental difficulty of choosing a proper hyperparameter. We then identify the difficulty and the instability of the optimization when the agent learns with curiosity. We propose our method, hyperparameter robust exploration (Hyper), which extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training. We theoretically justify that Hyper is provably efficient under function approximation setting and empirically demonstrate its appealing performance and robustness in various environments.
APA
Wang, Y., Liu, C., Li, Y., Amani, S., Zhou, B., & Yang, L. (2025). Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:63707-63733. Available from https://proceedings.mlr.press/v267/wang25bo.html.