Privacy-Preserving Decentralized Actor-Critic for Cooperative Multi-Agent Reinforcement Learning

Maheed H Ahmed, Mahsa Ghasemi
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:2755-2763, 2024.

Abstract

Multi-agent reinforcement learning has a wide range of applications in cooperative settings, but ensuring data privacy among agents is a significant challenge. To address this challenge, we propose Privacy-Preserving Decentralized Actor-Critic (PPDAC), an algorithm that motivates agents to cooperate while maintaining their data privacy. Leveraging trajectory ranking, PPDAC enables the agents to learn a cooperation reward that encourages agents to account for other agents’ preferences. Subsequently, each agent trains a policy that maximizes not only its local reward as in independent actor-critic (IAC) but also the cooperation reward, hence, increasing cooperation. Importantly, communication among agents is restricted to their ranking of trajectories that only include public identifiers without any private local data. Moreover, as an additional layer of privacy, the agents can perturb their rankings with the randomized response method. We evaluate PPDAC on the level-based foraging (LBF) environment and a coin-gathering environment. We compare with IAC and Shared Experience Actor-Critic (SEAC) which achieves SOTA results for the LBF environment. The results show that PPDAC consistently outperforms IAC. In addition, PPDAC outperforms SEAC in the coin-gathering environment and achieves similar performance in the LBF environment, all while providing better privacy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-h-ahmed24a, title = { Privacy-Preserving Decentralized Actor-Critic for Cooperative Multi-Agent Reinforcement Learning }, author = {H Ahmed, Maheed and Ghasemi, Mahsa}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {2755--2763}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/h-ahmed24a/h-ahmed24a.pdf}, url = {https://proceedings.mlr.press/v238/h-ahmed24a.html}, abstract = { Multi-agent reinforcement learning has a wide range of applications in cooperative settings, but ensuring data privacy among agents is a significant challenge. To address this challenge, we propose Privacy-Preserving Decentralized Actor-Critic (PPDAC), an algorithm that motivates agents to cooperate while maintaining their data privacy. Leveraging trajectory ranking, PPDAC enables the agents to learn a cooperation reward that encourages agents to account for other agents’ preferences. Subsequently, each agent trains a policy that maximizes not only its local reward as in independent actor-critic (IAC) but also the cooperation reward, hence, increasing cooperation. Importantly, communication among agents is restricted to their ranking of trajectories that only include public identifiers without any private local data. Moreover, as an additional layer of privacy, the agents can perturb their rankings with the randomized response method. We evaluate PPDAC on the level-based foraging (LBF) environment and a coin-gathering environment. We compare with IAC and Shared Experience Actor-Critic (SEAC) which achieves SOTA results for the LBF environment. The results show that PPDAC consistently outperforms IAC. In addition, PPDAC outperforms SEAC in the coin-gathering environment and achieves similar performance in the LBF environment, all while providing better privacy. } }
Endnote
%0 Conference Paper %T Privacy-Preserving Decentralized Actor-Critic for Cooperative Multi-Agent Reinforcement Learning %A Maheed H Ahmed %A Mahsa Ghasemi %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-h-ahmed24a %I PMLR %P 2755--2763 %U https://proceedings.mlr.press/v238/h-ahmed24a.html %V 238 %X Multi-agent reinforcement learning has a wide range of applications in cooperative settings, but ensuring data privacy among agents is a significant challenge. To address this challenge, we propose Privacy-Preserving Decentralized Actor-Critic (PPDAC), an algorithm that motivates agents to cooperate while maintaining their data privacy. Leveraging trajectory ranking, PPDAC enables the agents to learn a cooperation reward that encourages agents to account for other agents’ preferences. Subsequently, each agent trains a policy that maximizes not only its local reward as in independent actor-critic (IAC) but also the cooperation reward, hence, increasing cooperation. Importantly, communication among agents is restricted to their ranking of trajectories that only include public identifiers without any private local data. Moreover, as an additional layer of privacy, the agents can perturb their rankings with the randomized response method. We evaluate PPDAC on the level-based foraging (LBF) environment and a coin-gathering environment. We compare with IAC and Shared Experience Actor-Critic (SEAC) which achieves SOTA results for the LBF environment. The results show that PPDAC consistently outperforms IAC. In addition, PPDAC outperforms SEAC in the coin-gathering environment and achieves similar performance in the LBF environment, all while providing better privacy.
APA
H Ahmed, M. & Ghasemi, M.. (2024). Privacy-Preserving Decentralized Actor-Critic for Cooperative Multi-Agent Reinforcement Learning . Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:2755-2763 Available from https://proceedings.mlr.press/v238/h-ahmed24a.html.

Related Material