Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts

Onur Celik, Aleksandar Taranovic, Gerhard Neumann
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:5907-5933, 2024.

Abstract

Reinforcement learning (RL) is a powerful approach for acquiring a good-performing policy. However, learning diverse skills is challenging in RL due to the commonly used Gaussian policy parameterization. We propose Diverse Skill Learning (Di-SkilL), an RL method for learning diverse skills using Mixture of Experts, where each expert formalizes a skill as a contextual motion primitive. Di-SkilL optimizes each expert and its associated context distribution with respect to a maximum entropy objective that incentivizes learning diverse skills in similar contexts. The per-expert context distribution enables automatic curriculum learning, allowing each expert to focus on its best-performing sub-region of the context space. To overcome hard discontinuities and multi-modalities without any prior knowledge of the environment’s unknown context probability space, we leverage energy-based models to represent the per-expert context distributions and demonstrate how we can efficiently train them using the standard policy gradient objective. We show on challenging robot simulation tasks that Di-SkilL can learn diverse and performant skills.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-celik24a,
  title     = {Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts},
  author    = {Celik, Onur and Taranovic, Aleksandar and Neumann, Gerhard},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {5907--5933},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/celik24a/celik24a.pdf},
  url       = {https://proceedings.mlr.press/v235/celik24a.html},
  abstract  = {Reinforcement learning (RL) is a powerful approach for acquiring a good-performing policy. However, learning diverse skills is challenging in RL due to the commonly used Gaussian policy parameterization. We propose Diverse Skill Learning (Di-SkilL), an RL method for learning diverse skills using Mixture of Experts, where each expert formalizes a skill as a contextual motion primitive. Di-SkilL optimizes each expert and its associated context distribution with respect to a maximum entropy objective that incentivizes learning diverse skills in similar contexts. The per-expert context distribution enables automatic curriculum learning, allowing each expert to focus on its best-performing sub-region of the context space. To overcome hard discontinuities and multi-modalities without any prior knowledge of the environment’s unknown context probability space, we leverage energy-based models to represent the per-expert context distributions and demonstrate how we can efficiently train them using the standard policy gradient objective. We show on challenging robot simulation tasks that Di-SkilL can learn diverse and performant skills.}
}
Endnote
%0 Conference Paper
%T Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts
%A Onur Celik
%A Aleksandar Taranovic
%A Gerhard Neumann
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-celik24a
%I PMLR
%P 5907--5933
%U https://proceedings.mlr.press/v235/celik24a.html
%V 235
%X Reinforcement learning (RL) is a powerful approach for acquiring a good-performing policy. However, learning diverse skills is challenging in RL due to the commonly used Gaussian policy parameterization. We propose Diverse Skill Learning (Di-SkilL), an RL method for learning diverse skills using Mixture of Experts, where each expert formalizes a skill as a contextual motion primitive. Di-SkilL optimizes each expert and its associated context distribution with respect to a maximum entropy objective that incentivizes learning diverse skills in similar contexts. The per-expert context distribution enables automatic curriculum learning, allowing each expert to focus on its best-performing sub-region of the context space. To overcome hard discontinuities and multi-modalities without any prior knowledge of the environment’s unknown context probability space, we leverage energy-based models to represent the per-expert context distributions and demonstrate how we can efficiently train them using the standard policy gradient objective. We show on challenging robot simulation tasks that Di-SkilL can learn diverse and performant skills.
APA
Celik, O., Taranovic, A., & Neumann, G. (2024). Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:5907-5933. Available from https://proceedings.mlr.press/v235/celik24a.html.
