Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning

Matthias Gerstgrasser, David C. Parkes
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:11213-11236, 2023.

Abstract

Stackelberg equilibria arise naturally in a range of popular learning problems, such as in security games or indirect mechanism design, and have received increasing attention in the reinforcement learning literature. We present a general framework for implementing Stackelberg equilibria search as a multi-agent RL problem, allowing a wide range of algorithmic design choices. We discuss how previous approaches can be seen as specific instantiations of this framework. As a key insight, we note that the design space allows for approaches not previously seen in the literature, for instance by leveraging multitask and meta-RL techniques for follower convergence. We propose one such approach using contextual policies, and evaluate it experimentally on both standard and novel benchmark domains, showing greatly improved sample efficiency compared to previous approaches. Finally, we explore the effect of adopting algorithm designs outside the borders of our framework.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-gerstgrasser23a,
  title = {Oracles \& Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning},
  author = {Gerstgrasser, Matthias and Parkes, David C.},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {11213--11236},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/gerstgrasser23a/gerstgrasser23a.pdf},
  url = {https://proceedings.mlr.press/v202/gerstgrasser23a.html},
  abstract = {Stackelberg equilibria arise naturally in a range of popular learning problems, such as in security games or indirect mechanism design, and have received increasing attention in the reinforcement learning literature. We present a general framework for implementing Stackelberg equilibria search as a multi-agent RL problem, allowing a wide range of algorithmic design choices. We discuss how previous approaches can be seen as specific instantiations of this framework. As a key insight, we note that the design space allows for approaches not previously seen in the literature, for instance by leveraging multitask and meta-RL techniques for follower convergence. We propose one such approach using contextual policies, and evaluate it experimentally on both standard and novel benchmark domains, showing greatly improved sample efficiency compared to previous approaches. Finally, we explore the effect of adopting algorithm designs outside the borders of our framework.}
}
Endnote
%0 Conference Paper
%T Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning
%A Matthias Gerstgrasser
%A David C. Parkes
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-gerstgrasser23a
%I PMLR
%P 11213--11236
%U https://proceedings.mlr.press/v202/gerstgrasser23a.html
%V 202
%X Stackelberg equilibria arise naturally in a range of popular learning problems, such as in security games or indirect mechanism design, and have received increasing attention in the reinforcement learning literature. We present a general framework for implementing Stackelberg equilibria search as a multi-agent RL problem, allowing a wide range of algorithmic design choices. We discuss how previous approaches can be seen as specific instantiations of this framework. As a key insight, we note that the design space allows for approaches not previously seen in the literature, for instance by leveraging multitask and meta-RL techniques for follower convergence. We propose one such approach using contextual policies, and evaluate it experimentally on both standard and novel benchmark domains, showing greatly improved sample efficiency compared to previous approaches. Finally, we explore the effect of adopting algorithm designs outside the borders of our framework.
APA
Gerstgrasser, M. & Parkes, D. C. (2023). Oracles & Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:11213-11236. Available from https://proceedings.mlr.press/v202/gerstgrasser23a.html.