Multi-Agent Learning in Contextual Games under Unknown Constraints

Anna M. Maddux, Maryam Kamgarpour
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3142-3150, 2024.

Abstract

We consider the problem of learning to play a repeated contextual game with unknown reward and constraint functions. Such games arise in applications where each agent’s action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents’ policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations admit sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.
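To make the abstract's terminology concrete, the following is a minimal sketch of the two performance measures involved, stated in illustrative notation (player $i$'s reward $r_i$, a constraint function $g$ with feasibility meaning $g \le 0$, context $z^t$, and joint action $a^t$ at round $t$); this is an assumption-level sketch of the standard definitions, and the paper's precise formulation may differ.

% Illustrative sketch only; notation and exact definitions are not taken verbatim from the paper.
\begin{align*}
  R_i(T) &= \max_{a_i \in \mathcal{A}_i} \sum_{t=1}^{T} \Bigl[ r_i(a_i, a_{-i}^{t}, z^{t}) - r_i(a_i^{t}, a_{-i}^{t}, z^{t}) \Bigr] && \text{(player $i$'s contextual regret)} \\
  V(T)   &= \sum_{t=1}^{T} \max\bigl\{ 0,\, g(a^{t}, z^{t}) \bigr\} && \text{(cumulative constraint violation)}
\end{align*}

A strategy is no-regret and no-violation when $R_i(T)/T \to 0$ and $V(T)/T \to 0$ as $T \to \infty$; the latter is the sense in which the time-averaged sum of constraint violations vanishes as the game is repeated.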

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-maddux24a,
  title     = {Multi-Agent Learning in Contextual Games under Unknown Constraints},
  author    = {Maddux, Anna M. and Kamgarpour, Maryam},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {3142--3150},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/maddux24a/maddux24a.pdf},
  url       = {https://proceedings.mlr.press/v238/maddux24a.html},
  abstract  = {We consider the problem of learning to play a repeated contextual game with unknown reward and constraint functions. Such games arise in applications where each agent’s action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents’ policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations admit sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.}
}
Endnote
%0 Conference Paper
%T Multi-Agent Learning in Contextual Games under Unknown Constraints
%A Anna M. Maddux
%A Maryam Kamgarpour
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-maddux24a
%I PMLR
%P 3142--3150
%U https://proceedings.mlr.press/v238/maddux24a.html
%V 238
%X We consider the problem of learning to play a repeated contextual game with unknown reward and constraint functions. Such games arise in applications where each agent’s action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents’ policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations admit sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.
APA
Maddux, A.M. & Kamgarpour, M. (2024). Multi-Agent Learning in Contextual Games under Unknown Constraints. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3142-3150. Available from https://proceedings.mlr.press/v238/maddux24a.html.