Bayesian Design Principles for Frequentist Sequential Learning

Yunbei Xu, Assaf Zeevi
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:38768-38800, 2023.

Abstract

We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization approach to create "algorithmic beliefs" at each round, and use Bayesian posteriors to make decisions. This is the first approach to make Bayesian-type algorithms prior-free and applicable to adversarial settings, in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement. As a major application, we present a novel algorithm for multi-armed bandits that achieves the "best-of-all-worlds" empirical performance in the stochastic, adversarial, and non-stationary environments. And we illustrate how these principles can be used in linear bandits, convex bandits, and reinforcement learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-xu23u, title = {{B}ayesian Design Principles for Frequentist Sequential Learning}, author = {Xu, Yunbei and Zeevi, Assaf}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {38768--38800}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/xu23u/xu23u.pdf}, url = {https://proceedings.mlr.press/v202/xu23u.html}, abstract = {We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization approach to create "algorithmic beliefs" at each round, and use Bayesian posteriors to make decisions. This is the first approach to make Bayesian-type algorithms prior-free and applicable to adversarial settings, in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement. As a major application, we present a novel algorithm for multi-armed bandits that achieves the "best-of-all-worlds" empirical performance in the stochastic, adversarial, and non-stationary environments. And we illustrate how these principles can be used in linear bandits, convex bandits, and reinforcement learning.} }
Endnote
%0 Conference Paper %T Bayesian Design Principles for Frequentist Sequential Learning %A Yunbei Xu %A Assaf Zeevi %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-xu23u %I PMLR %P 38768--38800 %U https://proceedings.mlr.press/v202/xu23u.html %V 202 %X We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization approach to create "algorithmic beliefs" at each round, and use Bayesian posteriors to make decisions. This is the first approach to make Bayesian-type algorithms prior-free and applicable to adversarial settings, in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement. As a major application, we present a novel algorithm for multi-armed bandits that achieves the "best-of-all-worlds" empirical performance in the stochastic, adversarial, and non-stationary environments. And we illustrate how these principles can be used in linear bandits, convex bandits, and reinforcement learning.
APA
Xu, Y. & Zeevi, A.. (2023). Bayesian Design Principles for Frequentist Sequential Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:38768-38800 Available from https://proceedings.mlr.press/v202/xu23u.html.

Related Material