AIIR-MIX: Multi-Agent Reinforcement Learning Meets Attention Individual Intrinsic Reward Mixing Network
Proceedings of The 14th Asian Conference on Machine
Learning, PMLR 189:579-594, 2023.
Abstract
Deducing each agent's contribution and assigning
it a corresponding reward is a crucial problem in
cooperative Multi-Agent Reinforcement Learning
(MARL). Previous studies try to resolve the issue
by designing an intrinsic reward function, but they
simply add the intrinsic reward to the environment
reward, which leaves the performance of their MARL
frameworks unsatisfactory. We propose a novel
method named Attention Individual Intrinsic Reward
Mixing Network (AIIR-MIX) in MARL, and the
contributions of AIIR-MIX are listed as follows:
(a) we construct a novel intrinsic reward
network based on the attention mechanism to make
teamwork more effective; (b) we propose a
mixing network that combines intrinsic and
extrinsic rewards non-linearly and dynamically in
response to changing conditions of the
environment. We compare AIIR-MIX with many
state-of-the-art (SOTA) MARL methods on battle games
in StarCraft II. The results demonstrate that
AIIR-MIX outperforms current advanced methods
in average test win
rate. To validate the effectiveness of AIIR-MIX, we
conduct additional ablation studies. The results
show that AIIR-MIX can dynamically assign each agent
a real-time intrinsic reward in accordance with
its actual contribution.