Coco-Q: Learning in Stochastic Games with Side Payments

Eric Sodomka, Elizabeth Hilliard, Michael Littman, Amy Greenwald
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1471-1479, 2013.

Abstract

Coco (""cooperative/competitive"") values are a solution concept for two-player normal-form games with transferable utility, when binding agreements and side payments between players are possible. In this paper, we show that coco values can also be defined for stochastic games and can be learned using a simple variant of Q-learning that is provably convergent. We provide a set of examples showing how the strategies learned by the Coco-Q algorithm relate to those learned by existing multiagent Q-learning algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-sodomka13,
  title = {Coco-Q: Learning in Stochastic Games with Side Payments},
  author = {Eric Sodomka and Elizabeth Hilliard and Michael Littman and Amy Greenwald},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages = {1471--1479},
  year = {2013},
  editor = {Sanjoy Dasgupta and David McAllester},
  volume = {28},
  number = {3},
  series = {Proceedings of Machine Learning Research},
  address = {Atlanta, Georgia, USA},
  month = {17--19 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v28/sodomka13.pdf},
  url = {http://proceedings.mlr.press/v28/sodomka13.html},
  abstract = {Coco ("cooperative/competitive") values are a solution concept for two-player normal-form games with transferable utility, when binding agreements and side payments between players are possible. In this paper, we show that coco values can also be defined for stochastic games and can be learned using a simple variant of Q-learning that is provably convergent. We provide a set of examples showing how the strategies learned by the Coco-Q algorithm relate to those learned by existing multiagent Q-learning algorithms.}
}
Endnote
%0 Conference Paper
%T Coco-Q: Learning in Stochastic Games with Side Payments
%A Eric Sodomka
%A Elizabeth Hilliard
%A Michael Littman
%A Amy Greenwald
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-sodomka13
%I PMLR
%J Proceedings of Machine Learning Research
%P 1471--1479
%U http://proceedings.mlr.press
%V 28
%N 3
%W PMLR
%X Coco ("cooperative/competitive") values are a solution concept for two-player normal-form games with transferable utility, when binding agreements and side payments between players are possible. In this paper, we show that coco values can also be defined for stochastic games and can be learned using a simple variant of Q-learning that is provably convergent. We provide a set of examples showing how the strategies learned by the Coco-Q algorithm relate to those learned by existing multiagent Q-learning algorithms.
RIS
TY - CPAPER
TI - Coco-Q: Learning in Stochastic Games with Side Payments
AU - Eric Sodomka
AU - Elizabeth Hilliard
AU - Michael Littman
AU - Amy Greenwald
BT - Proceedings of the 30th International Conference on Machine Learning
PY - 2013/02/13
DA - 2013/02/13
ED - Sanjoy Dasgupta
ED - David McAllester
ID - pmlr-v28-sodomka13
PB - PMLR
SP - 1471
DP - PMLR
EP - 1479
L1 - http://proceedings.mlr.press/v28/sodomka13.pdf
UR - http://proceedings.mlr.press/v28/sodomka13.html
AB - Coco ("cooperative/competitive") values are a solution concept for two-player normal-form games with transferable utility, when binding agreements and side payments between players are possible. In this paper, we show that coco values can also be defined for stochastic games and can be learned using a simple variant of Q-learning that is provably convergent. We provide a set of examples showing how the strategies learned by the Coco-Q algorithm relate to those learned by existing multiagent Q-learning algorithms.
ER -
APA
Sodomka, E., Hilliard, E., Littman, M. & Greenwald, A. (2013). Coco-Q: Learning in Stochastic Games with Side Payments. Proceedings of the 30th International Conference on Machine Learning, in PMLR 28(3):1471-1479.