Multi-Agent Learning in Contextual Games under Unknown Constraints

Anna M. Maddux; Maryam Kamgarpour

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3142-3150, 2024.

Abstract

We consider the problem of learning to play a repeated contextual game with unknown reward and unknown constraint functions. Such games arise in applications where each agent’s action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents’ policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations have sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.

Cite this Paper

BibTeX

@InProceedings{pmlr-v238-maddux24a,
  title = 	 {Multi-Agent Learning in Contextual Games under Unknown Constraints},
  author =       {Maddux, Anna M. and Kamgarpour, Maryam},
  booktitle = 	 {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {3142--3150},
  year = 	 {2024},
  editor = 	 {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = 	 {238},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--04 May},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v238/maddux24a/maddux24a.pdf},
  url = 	 {https://proceedings.mlr.press/v238/maddux24a.html},
  abstract = 	 {We consider the problem of learning to play a repeated contextual game with unknown reward and unknown constraint functions. Such games arise in applications where each agent’s action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents’ policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations have sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.}
}

Endnote

%0 Conference Paper
%T Multi-Agent Learning in Contextual Games under Unknown Constraints
%A Anna M. Maddux
%A Maryam Kamgarpour
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-maddux24a
%I PMLR
%P 3142--3150
%U https://proceedings.mlr.press/v238/maddux24a.html
%V 238
%X We consider the problem of learning to play a repeated contextual game with unknown reward and unknown constraint functions. Such games arise in applications where each agent’s action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents’ policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach that exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations have sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.

APA

Maddux, A. M. & Kamgarpour, M. (2024). Multi-Agent Learning in Contextual Games under Unknown Constraints. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3142-3150. Available from https://proceedings.mlr.press/v238/maddux24a.html.
