Impact of Decentralized Learning on Player Utilities in Stackelberg Games

Kate Donahue, Nicole Immorlica, Meena Jagadeesan, Brendan Lucier, Aleksandrs Slivkins
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:11253-11310, 2024.

Abstract

When deployed in the world, a learning agent such as a recommender system or a chatbot often repeatedly interacts with another learning agent (such as a user) over time. In many such two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned. To better understand such cases, we examine the learning dynamics of the two-agent system and the implications for each agent’s objective. We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks (such as Stackelberg equilibrium payoffs) result in worst-case linear regret for at least one player. To better capture these systems, we construct a relaxed regret benchmark that is tolerant to small learning errors by agents. We show that standard learning algorithms fail to provide sublinear regret, and we develop algorithms to achieve near-optimal $\mathcal{O}(T^{2/3})$ regret for both players with respect to these benchmarks. We further design relaxed environments under which faster learning ($\mathcal{O}(\sqrt{T})$) is possible. Altogether, our results take a step towards assessing how two-agent interactions in sequential and decentralized learning environments affect the utility of both agents.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-donahue24a,
  title = {Impact of Decentralized Learning on Player Utilities in Stackelberg Games},
  author = {Donahue, Kate and Immorlica, Nicole and Jagadeesan, Meena and Lucier, Brendan and Slivkins, Aleksandrs},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {11253--11310},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/donahue24a/donahue24a.pdf},
  url = {https://proceedings.mlr.press/v235/donahue24a.html},
  abstract = {When deployed in the world, a learning agent such as a recommender system or a chatbot often repeatedly interacts with another learning agent (such as a user) over time. In many such two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned. To better understand such cases, we examine the learning dynamics of the two-agent system and the implications for each agent’s objective. We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks (such as Stackelberg equilibrium payoffs) result in worst-case linear regret for at least one player. To better capture these systems, we construct a relaxed regret benchmark that is tolerant to small learning errors by agents. We show that standard learning algorithms fail to provide sublinear regret, and we develop algorithms to achieve near-optimal $\mathcal{O}(T^{2/3})$ regret for both players with respect to these benchmarks. We further design relaxed environments under which faster learning ($\mathcal{O}(\sqrt{T})$) is possible. Altogether, our results take a step towards assessing how two-agent interactions in sequential and decentralized learning environments affect the utility of both agents.}
}
Endnote
%0 Conference Paper
%T Impact of Decentralized Learning on Player Utilities in Stackelberg Games
%A Kate Donahue
%A Nicole Immorlica
%A Meena Jagadeesan
%A Brendan Lucier
%A Aleksandrs Slivkins
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-donahue24a
%I PMLR
%P 11253--11310
%U https://proceedings.mlr.press/v235/donahue24a.html
%V 235
%X When deployed in the world, a learning agent such as a recommender system or a chatbot often repeatedly interacts with another learning agent (such as a user) over time. In many such two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned. To better understand such cases, we examine the learning dynamics of the two-agent system and the implications for each agent’s objective. We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks (such as Stackelberg equilibrium payoffs) result in worst-case linear regret for at least one player. To better capture these systems, we construct a relaxed regret benchmark that is tolerant to small learning errors by agents. We show that standard learning algorithms fail to provide sublinear regret, and we develop algorithms to achieve near-optimal $\mathcal{O}(T^{2/3})$ regret for both players with respect to these benchmarks. We further design relaxed environments under which faster learning ($\mathcal{O}(\sqrt{T})$) is possible. Altogether, our results take a step towards assessing how two-agent interactions in sequential and decentralized learning environments affect the utility of both agents.
APA
Donahue, K., Immorlica, N., Jagadeesan, M., Lucier, B. & Slivkins, A. (2024). Impact of Decentralized Learning on Player Utilities in Stackelberg Games. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:11253-11310. Available from https://proceedings.mlr.press/v235/donahue24a.html.
