Online Learning under Budget and ROI Constraints via Weak Adaptivity

Matteo Castiglioni; Andrea Celli; Christian Kroer

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:5792-5816, 2024.

Abstract

We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning problems under adversarial inputs rely on two fundamental assumptions. First, the decision maker must know beforehand the value of parameters related to the degree of strict feasibility of the problem (i.e., Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. In this paper, we show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. This results in a “dual-balancing” framework which ensures that dual variables stay sufficiently small, even in the absence of knowledge about Slater’s parameter. We prove the first best-of-both-worlds no-regret guarantees that hold in the absence of the two aforementioned assumptions, under both stochastic and adversarial inputs. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.
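
The following is a minimal illustrative sketch of a generic primal-dual bidding loop in the spirit of the abstract; it is not the authors' algorithm. Everything specific in it is an assumption made for illustration: the repeated second-price auction setting, the value-maximizing objective, the per-round budget target `rho = budget / T`, the ROI constraint written as "total payment at most total value won", the function name `run_primal_dual`, and the parameters `eta` and `dual_cap`. A projected gradient step with a fixed step size stands in for the weakly adaptive dual regret minimizer mentioned above.

```python
import random


def project(x, lo, hi):
    """Clip x onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def run_primal_dual(values, prices, budget, eta=0.05, dual_cap=10.0):
    """Hypothetical primal-dual bidding loop for second-price auctions.

    values[t]: bidder's value in round t; prices[t]: highest competing bid.
    lam is the dual variable of the budget constraint, mu that of the
    ROI constraint. All names and defaults are illustrative assumptions.
    """
    T = len(values)
    rho = budget / T          # assumed per-round spending target
    lam, mu = 0.0, 0.0
    remaining = budget
    total_value, total_cost = 0.0, 0.0

    for v, p in zip(values, prices):
        # Primal step: bid that maximizes the per-round Lagrangian
        #   x * ((1 + mu) * v - (lam + mu) * p) + lam * rho
        # in a second-price auction (win exactly when that term is >= 0).
        denom = lam + mu
        bid = (1.0 + mu) * v / denom if denom > 0 else float("inf")
        bid = min(bid, remaining)  # payment <= bid, so budget is never exceeded

        win = bid >= p
        cost = p if win else 0.0
        gain = v if win else 0.0
        remaining -= cost
        total_value += gain
        total_cost += cost

        # Dual step: fixed-step projected gradient ascent on the violations.
        lam = project(lam + eta * (cost - rho), 0.0, dual_cap)
        mu = project(mu + eta * (cost - gain), 0.0, dual_cap)

    return {"total_value": total_value, "total_cost": total_cost,
            "lam": lam, "mu": mu}


if __name__ == "__main__":
    random.seed(0)
    vals = [random.random() for _ in range(1000)]
    prs = [random.random() for _ in range(1000)]
    print(run_primal_dual(vals, prs, budget=100.0))
```

In this sketch the budget is enforced as a hard cap on each bid, while the ROI constraint is only encouraged through `mu`; the bound `dual_cap` is a simplification of the sketch and is not tied to any Slater parameter.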

Cite this Paper

BibTeX

@InProceedings{pmlr-v235-castiglioni24a,
  title = 	 {Online Learning under Budget and {ROI} Constraints via Weak Adaptivity},
  author =       {Castiglioni, Matteo and Celli, Andrea and Kroer, Christian},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {5792--5816},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/castiglioni24a/castiglioni24a.pdf},
  url = 	 {https://proceedings.mlr.press/v235/castiglioni24a.html},
  abstract = 	 {We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning problems under adversarial inputs rely on two fundamental assumptions. First, the decision maker must know beforehand the value of parameters related to the degree of strict feasibility of the problem (i.e. Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. In this paper, we show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. This results in a “dual-balancing” framework which ensures that dual variables stay sufficiently small, even in the absence of knowledge about Slater’s parameter. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions, under stochastic and adversarial inputs. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.}
}

Endnote

%0 Conference Paper
%T Online Learning under Budget and ROI Constraints via Weak Adaptivity
%A Matteo Castiglioni
%A Andrea Celli
%A Christian Kroer
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-castiglioni24a
%I PMLR
%P 5792--5816
%U https://proceedings.mlr.press/v235/castiglioni24a.html
%V 235
%X We study online learning problems in which a decision maker has to make a sequence of costly decisions, with the goal of maximizing their expected reward while adhering to budget and return-on-investment (ROI) constraints. Existing primal-dual algorithms designed for constrained online learning problems under adversarial inputs rely on two fundamental assumptions. First, the decision maker must know beforehand the value of parameters related to the degree of strict feasibility of the problem (i.e. Slater parameters). Second, a strictly feasible solution to the offline optimization problem must exist at each round. Both requirements are unrealistic for practical applications such as bidding in online ad auctions. In this paper, we show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. This results in a “dual-balancing” framework which ensures that dual variables stay sufficiently small, even in the absence of knowledge about Slater’s parameter. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions, under stochastic and adversarial inputs. Finally, we show how to instantiate the framework to optimally bid in various mechanisms of practical relevance, such as first- and second-price auctions.

APA

Castiglioni, M., Celli, A. & Kroer, C. (2024). Online Learning under Budget and ROI Constraints via Weak Adaptivity. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:5792-5816. Available from https://proceedings.mlr.press/v235/castiglioni24a.html.
