Online Learning with Knapsacks: the Best of Both Worlds

Matteo Castiglioni, Andrea Celli, Christian Kroer
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2767-2783, 2022.

Abstract

We study online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of $m$ resource constraints. By casting the learning process over a suitably defined space of strategy mixtures, we recover strong duality on a Lagrangian relaxation of the underlying optimization problem, even for general settings with non-convex reward and resource-consumption functions. Then, we provide the first best-of-both-worlds type framework for this setting, with no-regret guarantees under both stochastic and adversarial inputs. Our framework yields the same regret guarantees as prior work in the stochastic case. On the other hand, when budgets grow at least linearly in the time horizon, it allows us to provide a constant competitive ratio in the adversarial case, which improves over the $O(m \log T)$ competitive ratio of Immorlica et al. [FOCS'19]. Moreover, our framework allows the decision maker to handle non-convex reward and cost functions. We provide two game-theoretic applications of our framework to give further evidence of its flexibility.
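
To make the Lagrangian primal-dual idea concrete: the framework works with a relaxation of the form $\mathcal{L}_t(x, \lambda) = f_t(x) + \langle \lambda, \rho - c_t(x) \rangle$, where $f_t$ is the reward, $c_t$ the resource consumption, $\rho = B/T$ the per-round budget, and $\lambda$ a vector of Lagrange multipliers. The Python sketch below shows a generic scheme in this spirit, assuming full feedback, a finite action set, and rewards and costs in $[0,1]$; the function name, step sizes, and the multiplicative-weights/gradient-descent pairing are illustrative choices, not the paper's exact algorithm.

import numpy as np

def lagrangian_primal_dual(T, rewards, costs, rho):
    """Generic Lagrangian primal-dual loop for online learning with knapsacks.

    rewards[t]: length-n array of per-action rewards f_t in [0, 1]
    costs[t]:   (n, m) array of per-action resource consumption c_t in [0, 1]
    rho:        per-round budget, i.e. B / T for total budget B
    (Hypothetical sketch: names and step sizes are illustrative.)
    """
    n, m = costs[0].shape
    eta_primal = np.sqrt(np.log(n) / T)   # multiplicative-weights step size
    eta_dual = 1.0 / np.sqrt(T)           # online-gradient-descent step size

    w = np.ones(n)                 # primal player: weights over actions
    lam = np.zeros(m)              # dual player: one multiplier per resource
    budget = rho * T * np.ones(m)  # remaining budget per resource
    total_reward = 0.0

    for t in range(T):
        if np.any(budget < 1.0):   # stop before any resource can be exceeded
            break
        p = w / w.sum()
        x = np.random.choice(n, p=p)
        total_reward += rewards[t][x]
        budget -= costs[t][x]

        # Primal update on the Lagrangian payoff f_t(x) - <lam, c_t(x)>
        w *= np.exp(eta_primal * (rewards[t] - costs[t] @ lam))
        w /= w.sum()               # renormalize for numerical stability

        # Dual update: raise the price of resources consumed faster than
        # the per-round budget, projecting onto [0, 1/rho]^m
        lam = np.clip(lam + eta_dual * (costs[t][x] - rho), 0.0, 1.0 / rho)

    return total_reward, budget

The primal player maximizes the Lagrangian payoff while the dual player increases the price of over-consumed resources; the $[0, 1/\rho]$ cap on the multipliers is one way to see why the guarantees depend on how fast the budgets grow with $T$.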

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-castiglioni22a,
  title     = {Online Learning with Knapsacks: the Best of Both Worlds},
  author    = {Castiglioni, Matteo and Celli, Andrea and Kroer, Christian},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {2767--2783},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/castiglioni22a/castiglioni22a.pdf},
  url       = {https://proceedings.mlr.press/v162/castiglioni22a.html}
}
Endnote
%0 Conference Paper
%T Online Learning with Knapsacks: the Best of Both Worlds
%A Matteo Castiglioni
%A Andrea Celli
%A Christian Kroer
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-castiglioni22a
%I PMLR
%P 2767--2783
%U https://proceedings.mlr.press/v162/castiglioni22a.html
%V 162
APA
Castiglioni, M., Celli, A. & Kroer, C. (2022). Online Learning with Knapsacks: the Best of Both Worlds. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:2767-2783. Available from https://proceedings.mlr.press/v162/castiglioni22a.html.
