Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks

Saptarshi Nath, Christos Peridis, Eseoghene Ben-Iwhiwhu, Xinran Liu, Shirin Dora, Cong Liu, Soheil Kolouri, Andrea Soltoggio
Proceedings of The 2nd Conference on Lifelong Learning Agents, PMLR 232:936-960, 2023.

Abstract

Lifelong learning agents aim to learn multiple tasks sequentially over a lifetime. This involves the ability to exploit previous knowledge when learning new tasks and to avoid forgetting. Recently, modulating masks, a specific type of parameter isolation approach, have shown promise in both supervised and reinforcement learning. While lifelong learning algorithms have been investigated mainly within a single-agent approach, a question remains on how multiple agents can share lifelong learning knowledge with each other. We show that the parameter isolation mechanism used by modulating masks is particularly suitable for exchanging knowledge among agents in a distributed and decentralized system of lifelong learners. The key idea is that isolating specific task knowledge to specific masks allows agents to transfer only specific knowledge on-demand, resulting in a robust and effective collective of agents. We assume fully distributed and asynchronous scenarios with dynamic agent numbers and connectivity. An on-demand communication protocol ensures agents query their peers for specific masks to be transferred and integrated into their policies when facing each task. Experiments indicate that on-demand mask communication is an effective way to implement distributed and decentralized lifelong reinforcement learning, and provides a lifelong learning benefit with respect to distributed RL baselines such as DD-PPO, IMPALA, and PPO+EWC. The system is particularly robust to connection drops and demonstrates rapid learning due to knowledge exchange.
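The on-demand exchange described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary masks applied element-wise to a fixed backbone that all agents share, and it accepts the first mask any reachable peer returns. The names `Agent`, `handle_query`, `request_mask`, and `effective_weights` are hypothetical, and the `online` flag merely simulates a dropped connection.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical representation: a binary mask with one entry per backbone weight.
Mask = List[int]


@dataclass
class Agent:
    name: str
    backbone: List[float]  # fixed weights, assumed identical across agents
    masks: Dict[str, Mask] = field(default_factory=dict)  # task id -> mask
    online: bool = True  # False simulates a dropped connection

    def handle_query(self, task_id: str) -> Optional[Mask]:
        """Answer a peer's request with a copy of the mask, if this agent has one."""
        if not self.online:
            return None
        mask = self.masks.get(task_id)
        return list(mask) if mask is not None else None

    def request_mask(self, task_id: str, peers: List["Agent"]) -> bool:
        """Query peers for one specific task mask and keep the first reply."""
        if task_id in self.masks:
            return True
        for peer in peers:
            mask = peer.handle_query(task_id)
            if mask is not None:
                self.masks[task_id] = mask
                return True
        return False  # no peer could help; the agent would learn from scratch

    def effective_weights(self, task_id: str) -> List[float]:
        """Apply the task mask to the backbone by element-wise multiplication."""
        mask = self.masks[task_id]
        return [w * m for w, m in zip(self.backbone, mask)]


backbone = [0.5, 1.0, 2.0, 0.25]
a = Agent("a", backbone, masks={"task_1": [1, 0, 1, 0]})
b = Agent("b", backbone, online=False)  # unreachable peer
c = Agent("c", backbone)                # has no masks yet

found = c.request_mask("task_1", peers=[b, a])
weights = c.effective_weights("task_1")
missing = c.request_mask("task_2", peers=[a, b])
```

Because only the mask for the requested task is transferred, an unreachable peer (`b` here) simply contributes nothing and the query falls through to the next peer. How the paper selects among several replies or integrates a received mask into a policy is not specified in the abstract and is not modeled here.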

Cite this Paper


BibTeX
@InProceedings{pmlr-v232-nath23a,
  title = {Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks},
  author = {Nath, Saptarshi and Peridis, Christos and Ben-Iwhiwhu, Eseoghene and Liu, Xinran and Dora, Shirin and Liu, Cong and Kolouri, Soheil and Soltoggio, Andrea},
  booktitle = {Proceedings of The 2nd Conference on Lifelong Learning Agents},
  pages = {936--960},
  year = {2023},
  editor = {Chandar, Sarath and Pascanu, Razvan and Sedghi, Hanie and Precup, Doina},
  volume = {232},
  series = {Proceedings of Machine Learning Research},
  month = {22--25 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v232/nath23a/nath23a.pdf},
  url = {https://proceedings.mlr.press/v232/nath23a.html},
  abstract = {Lifelong learning agents aim to learn multiple tasks sequentially over a lifetime. This involves the ability to exploit previous knowledge when learning new tasks and to avoid forgetting. Recently, modulating masks, a specific type of parameter isolation approach, have shown promise in both supervised and reinforcement learning. While lifelong learning algorithms have been investigated mainly within a single-agent approach, a question remains on how multiple agents can share lifelong learning knowledge with each other. We show that the parameter isolation mechanism used by modulating masks is particularly suitable for exchanging knowledge among agents in a distributed and decentralized system of lifelong learners. The key idea is that isolating specific task knowledge to specific masks allows agents to transfer only specific knowledge on-demand, resulting in a robust and effective collective of agents. We assume fully distributed and asynchronous scenarios with dynamic agent numbers and connectivity. An on-demand communication protocol ensures agents query their peers for specific masks to be transferred and integrated into their policies when facing each task. Experiments indicate that on-demand mask communication is an effective way to implement distributed and decentralized lifelong reinforcement learning, and provides a lifelong learning benefit with respect to distributed RL baselines such as DD-PPO, IMPALA, and PPO+EWC. The system is particularly robust to connection drops and demonstrates rapid learning due to knowledge exchange.}
}
Endnote
%0 Conference Paper
%T Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
%A Saptarshi Nath
%A Christos Peridis
%A Eseoghene Ben-Iwhiwhu
%A Xinran Liu
%A Shirin Dora
%A Cong Liu
%A Soheil Kolouri
%A Andrea Soltoggio
%B Proceedings of The 2nd Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2023
%E Sarath Chandar
%E Razvan Pascanu
%E Hanie Sedghi
%E Doina Precup
%F pmlr-v232-nath23a
%I PMLR
%P 936--960
%U https://proceedings.mlr.press/v232/nath23a.html
%V 232
%X Lifelong learning agents aim to learn multiple tasks sequentially over a lifetime. This involves the ability to exploit previous knowledge when learning new tasks and to avoid forgetting. Recently, modulating masks, a specific type of parameter isolation approach, have shown promise in both supervised and reinforcement learning. While lifelong learning algorithms have been investigated mainly within a single-agent approach, a question remains on how multiple agents can share lifelong learning knowledge with each other. We show that the parameter isolation mechanism used by modulating masks is particularly suitable for exchanging knowledge among agents in a distributed and decentralized system of lifelong learners. The key idea is that isolating specific task knowledge to specific masks allows agents to transfer only specific knowledge on-demand, resulting in a robust and effective collective of agents. We assume fully distributed and asynchronous scenarios with dynamic agent numbers and connectivity. An on-demand communication protocol ensures agents query their peers for specific masks to be transferred and integrated into their policies when facing each task. Experiments indicate that on-demand mask communication is an effective way to implement distributed and decentralized lifelong reinforcement learning, and provides a lifelong learning benefit with respect to distributed RL baselines such as DD-PPO, IMPALA, and PPO+EWC. The system is particularly robust to connection drops and demonstrates rapid learning due to knowledge exchange.
APA
Nath, S., Peridis, C., Ben-Iwhiwhu, E., Liu, X., Dora, S., Liu, C., Kolouri, S., & Soltoggio, A. (2023). Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks. Proceedings of The 2nd Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 232:936-960. Available from https://proceedings.mlr.press/v232/nath23a.html.
