Adaptively Perturbed Mirror Descent for Learning in Games

Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, Atsushi Iwasaki
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:31-80, 2024.

Abstract

This paper proposes a payoff perturbation technique for the Mirror Descent (MD) algorithm in games whose payoff gradients are monotone in the strategy profile space and potentially contaminated by additive noise. The optimistic family of learning algorithms, exemplified by optimistic MD, achieves last-iterate convergence to a Nash equilibrium in noise-free settings. A recently re-emerging line of work highlights the promise of the perturbation approach, in which payoff functions are perturbed according to the distance from an anchoring, or slingshot, strategy. In response, we propose Adaptively Perturbed MD (APMD), which adjusts the magnitude of the perturbation by repeatedly updating the slingshot strategy at a predefined interval. This scheme allows APMD to find a Nash equilibrium of the underlying game at guaranteed convergence rates. Empirical results confirm that our algorithm exhibits significantly accelerated convergence.
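To make the perturbation-plus-reanchoring idea concrete, below is a minimal, illustrative sketch on a two-player zero-sum matrix game. The function name apmd_matrix_game and all hyperparameter values are hypothetical, not from the paper; this sketch instantiates the perturbation as a squared-L2 distance to the slingshot strategy and the mirror map as negative entropy (exponentiated-gradient updates), which is only one of the choices the general framework admits.

```python
import numpy as np

def apmd_matrix_game(A, eta=0.05, mu=0.1, K=100, T=10000):
    """Illustrative sketch of Adaptively Perturbed Mirror Descent (APMD)
    on a zero-sum matrix game: player 1 maximizes x^T A y, player 2
    minimizes it. Mirror map: negative entropy, so each MD step is an
    exponentiated-gradient update. Perturbation: squared-L2 distance to
    the slingshot strategies, whose gradient adds mu * (slingshot - current)
    to each player's payoff gradient. The slingshot is re-anchored to the
    current strategy every K iterations."""
    n, m = A.shape
    x = np.full(n, 1.0 / n)            # player 1 strategy
    y = np.full(m, 1.0 / m)            # player 2 strategy
    sx, sy = x.copy(), y.copy()        # slingshot (anchor) strategies
    for t in range(T):
        gx = A @ y + mu * (sx - x)     # perturbed gradient, player 1
        gy = -A.T @ x + mu * (sy - y)  # perturbed gradient, player 2
        x = x * np.exp(eta * gx)       # entropic mirror descent step
        x /= x.sum()
        y = y * np.exp(eta * gy)
        y /= y.sum()
        if (t + 1) % K == 0:           # re-anchor: update the slingshot
            sx, sy = x.copy(), y.copy()
    return x, y

# Usage: matching pennies, whose unique Nash equilibrium is uniform play.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = apmd_matrix_game(A)
print(x, y)  # both strategies should approach [0.5, 0.5]
```

With a fixed slingshot, the perturbed game is strongly monotone and the iterates settle near the slingshot's perturbed equilibrium; periodically re-anchoring the slingshot is what drags that stationary point toward a Nash equilibrium of the original game, which is the adaptive mechanism the abstract describes.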

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-abe24a,
  title     = {Adaptively Perturbed Mirror Descent for Learning in Games},
  author    = {Abe, Kenshi and Ariu, Kaito and Sakamoto, Mitsuki and Iwasaki, Atsushi},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {31--80},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/abe24a/abe24a.pdf},
  url       = {https://proceedings.mlr.press/v235/abe24a.html}
}
Endnote
%0 Conference Paper
%T Adaptively Perturbed Mirror Descent for Learning in Games
%A Kenshi Abe
%A Kaito Ariu
%A Mitsuki Sakamoto
%A Atsushi Iwasaki
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-abe24a
%I PMLR
%P 31--80
%U https://proceedings.mlr.press/v235/abe24a.html
%V 235
APA
Abe, K., Ariu, K., Sakamoto, M. & Iwasaki, A. (2024). Adaptively Perturbed Mirror Descent for Learning in Games. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:31-80. Available from https://proceedings.mlr.press/v235/abe24a.html.
