Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:12167-12176, 2021.

Abstract

We study multi-objective reinforcement learning (RL) where an agent’s reward is represented as a vector. In settings where an agent competes against opponents, its performance is measured by the distance of its average return vector to a target set. We develop statistically and computationally efficient algorithms to approach the associated target set. Our results extend Blackwell’s approachability theorem (Blackwell, 1956) to tabular RL, where strategic exploration becomes essential. The algorithms presented are adaptive; their guarantees hold even without Blackwell’s approachability condition. If the opponents use fixed policies, we give an improved rate of approaching the target set while also tackling the more ambitious goal of simultaneously minimizing a scalar cost function. We discuss our analysis for this special case by relating our results to previous works on constrained RL. To our knowledge, this work provides the first provably efficient algorithms for vector-valued Markov games and our theoretical guarantees are near-optimal.
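For readers skimming the setting, here is a minimal formalization of the approachability goal the abstract describes (the notation below is ours, not necessarily the paper's): after $K$ episodes against an adversarial opponent, the agent wants its average vector return to land near a target set $\mathcal{S} \subset \mathbb{R}^d$,

$$\mathrm{dist}\!\left(\frac{1}{K}\sum_{k=1}^{K} \mathbf{r}^k,\; \mathcal{S}\right) \le \varepsilon, \qquad \mathrm{dist}(x, \mathcal{S}) := \inf_{y \in \mathcal{S}} \lVert x - y \rVert_2,$$

where $\mathbf{r}^k \in \mathbb{R}^d$ is the vector return collected in episode $k$. In Blackwell's classical repeated-game result, approachability holds when every halfspace containing $\mathcal{S}$ can be "forced" by some fixed strategy; the extension here is to tabular Markov games, where the transition dynamics are unknown and the forcing strategies must be learned, making strategic exploration essential.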

Cite this Paper

BibTeX
@InProceedings{pmlr-v139-yu21b,
  title     = {Provably Efficient Algorithms for Multi-Objective Competitive RL},
  author    = {Yu, Tiancheng and Tian, Yi and Zhang, Jingzhao and Sra, Suvrit},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {12167--12176},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/yu21b/yu21b.pdf},
  url       = {https://proceedings.mlr.press/v139/yu21b.html}
}
Endnote
%0 Conference Paper
%T Provably Efficient Algorithms for Multi-Objective Competitive RL
%A Tiancheng Yu
%A Yi Tian
%A Jingzhao Zhang
%A Suvrit Sra
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-yu21b
%I PMLR
%P 12167--12176
%U https://proceedings.mlr.press/v139/yu21b.html
%V 139
APA
Yu, T., Tian, Y., Zhang, J. & Sra, S. (2021). Provably Efficient Algorithms for Multi-Objective Competitive RL. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:12167-12176. Available from https://proceedings.mlr.press/v139/yu21b.html.