GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning

Haoyuan Qin, Zhengzhu Liu, Chenxing Lin, Chennan Ma, Songzhu Mei, Siqi Shen, Cheng Wang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:50246-50268, 2025.

Abstract

Parameter-sharing (PS) techniques have been widely adopted in cooperative Multi-Agent Reinforcement Learning (MARL). In PS, all agents share a policy network with identical parameters, which yields good sample efficiency. However, PS can lead to homogeneous policies that limit MARL performance. We tackle this problem from the angle of gradient conflict among agents. We find that futile neurons, whose updates are canceled out by gradient conflicts among agents, lead to poor learning efficiency and diversity. To address this deficiency, we propose GradPS, a gradient-based PS method. It dynamically creates multiple clones of each futile neuron. For each clone, a group of agents with low gradient conflict shares the neuron's parameters. Our method enjoys good sample efficiency by sharing gradients among agents of the same clone neuron. Moreover, it encourages diverse behaviors by independently updating each exclusive clone neuron. Through extensive experiments, we show that GradPS learns diverse policies with promising performance. The source code for GradPS is available at https://github.com/xmu-rl-3dv/GradPS.
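To make the abstract's mechanism concrete, here is a toy sketch of the two ideas it describes: scoring a shared neuron as "futile" when its agents' gradients largely cancel, and grouping agents with low mutual gradient conflict so each group could share one clone. This is an illustration only; the names `futility_score` and `group_by_conflict`, the cancellation ratio, and the greedy cosine-similarity grouping are assumptions for exposition, not the paper's actual criterion or algorithm.

```python
import numpy as np

def futility_score(agent_grads):
    """Ratio of the aggregated update's magnitude to the summed per-agent
    gradient magnitudes for one neuron's weights. A value near 0 means the
    per-agent gradients largely cancel, so the shared update is futile."""
    agent_grads = np.asarray(agent_grads, dtype=float)  # (n_agents, n_weights)
    summed = np.linalg.norm(agent_grads.sum(axis=0))
    total = np.linalg.norm(agent_grads, axis=1).sum()
    return summed / (total + 1e-12)

def group_by_conflict(agent_grads, threshold=0.0):
    """Greedily group agents whose pairwise gradient cosine similarity
    exceeds `threshold`; each group would share one clone of the neuron."""
    agent_grads = np.asarray(agent_grads, dtype=float)
    unit = agent_grads / (np.linalg.norm(agent_grads, axis=1, keepdims=True) + 1e-12)
    groups = []
    for i in range(len(unit)):
        for g in groups:
            # join the first existing group this agent does not conflict with
            if all(unit[i] @ unit[j] > threshold for j in g):
                g.append(i)
                break
        else:
            groups.append([i])  # no compatible group: start a new clone
    return groups
```

Under this sketch, two agents pushing a neuron in opposite directions give a futility score near zero and land in separate groups, while aligned agents share one group, matching the abstract's intuition that low-conflict agents can keep sharing parameters.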

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-qin25c,
  title     = {{G}rad{PS}: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning},
  author    = {Qin, Haoyuan and Liu, Zhengzhu and Lin, Chenxing and Ma, Chennan and Mei, Songzhu and Shen, Siqi and Wang, Cheng},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {50246--50268},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/qin25c/qin25c.pdf},
  url       = {https://proceedings.mlr.press/v267/qin25c.html},
  abstract  = {Parameter-sharing (PS) techniques have been widely adopted in cooperative Multi-Agent Reinforcement Learning (MARL). In PS, all the agents share a policy network with identical parameters, which enjoys good sample efficiency. However, PS could lead to homogeneous policies that limit MARL performance. We tackle this problem from the angle of gradient conflict among agents. We find that the existence of futile neurons whose update is canceled out by gradient conflicts among agents leads to poor learning efficiency and diversity. To address this deficiency, we propose GradPS, a gradient-based PS method. It dynamically creates multiple clones for each futile neuron. For each clone, a group of agents with low gradient-conflict shares the neuron’s parameters. Our method can enjoy good sample efficiency by sharing the gradients among agents of the same clone neuron. Moreover, it can encourage diverse behaviors through independently updating an exclusive clone neuron. Through extensive experiments, we show that GradPS can learn diverse policies with promising performance. The source code for GradPS is available in https://github.com/xmu-rl-3dv/GradPS.}
}
Endnote
%0 Conference Paper
%T GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning
%A Haoyuan Qin
%A Zhengzhu Liu
%A Chenxing Lin
%A Chennan Ma
%A Songzhu Mei
%A Siqi Shen
%A Cheng Wang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-qin25c
%I PMLR
%P 50246--50268
%U https://proceedings.mlr.press/v267/qin25c.html
%V 267
%X Parameter-sharing (PS) techniques have been widely adopted in cooperative Multi-Agent Reinforcement Learning (MARL). In PS, all the agents share a policy network with identical parameters, which enjoys good sample efficiency. However, PS could lead to homogeneous policies that limit MARL performance. We tackle this problem from the angle of gradient conflict among agents. We find that the existence of futile neurons whose update is canceled out by gradient conflicts among agents leads to poor learning efficiency and diversity. To address this deficiency, we propose GradPS, a gradient-based PS method. It dynamically creates multiple clones for each futile neuron. For each clone, a group of agents with low gradient-conflict shares the neuron’s parameters. Our method can enjoy good sample efficiency by sharing the gradients among agents of the same clone neuron. Moreover, it can encourage diverse behaviors through independently updating an exclusive clone neuron. Through extensive experiments, we show that GradPS can learn diverse policies with promising performance. The source code for GradPS is available in https://github.com/xmu-rl-3dv/GradPS.
APA
Qin, H., Liu, Z., Lin, C., Ma, C., Mei, S., Shen, S. & Wang, C. (2025). GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:50246-50268. Available from https://proceedings.mlr.press/v267/qin25c.html.