Fast Rate Learning in Stochastic First Price Bidding

Juliette Achddou, Olivier Cappé, Aurélien Garivier
Proceedings of The 13th Asian Conference on Machine Learning, PMLR 157:1754-1769, 2021.

Abstract

First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. % As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. % They have already given rise to several works in sequential learning, % many of which consider models for which the value of the buyer or the opponents’ maximal bid is chosen in an adversarial manner. Even in the simplest settings, this gives rise to algorithms whose pseudo-regret grows as $\sqrt{T}$ with respect to the time horizon $T$. % Focusing on the case where the buyer plays against a stationary stochastic environment, we show how to achieve significantly lower pseudo-regret: when the opponents’ maximal bid distribution is known we provide an algorithm whose pseudo-regret can be as low as $\log^2(T)$; in the case where the distribution must be learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+ \epsilon}$ pseudo-regret, for any $\epsilon>0$. % To obtain these results, we introduce two novel ideas that can be of interest in their own right. First, by transposing results obtained in the posted price setting, we provide conditions under which the first-price bidding utility is locally quadratic around its optimum. Second, we leverage the observation that, on small sub-intervals, the concentration of the variations of the empirical distribution function may be controlled more accurately than by using the classical Dvoretzky-Kiefer-Wolfowitz inequality. % Numerical simulations confirm that our algorithms converge much faster than alternatives proposed in the literature for various bid distributions, including for bids collected on an actual programmatic advertising platform.

Cite this Paper


BibTeX
@InProceedings{pmlr-v157-achddou21a, title = {Fast Rate Learning in Stochastic First Price Bidding}, author = {Achddou, Juliette and Capp\'e, Olivier and Garivier, Aur\'elien}, booktitle = {Proceedings of The 13th Asian Conference on Machine Learning}, pages = {1754--1769}, year = {2021}, editor = {Balasubramanian, Vineeth N. and Tsang, Ivor}, volume = {157}, series = {Proceedings of Machine Learning Research}, month = {17--19 Nov}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v157/achddou21a/achddou21a.pdf}, url = {https://proceedings.mlr.press/v157/achddou21a.html}, abstract = {First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. % As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. % They have already given rise to several works in sequential learning, % many of which consider models for which the value of the buyer or the opponents’ maximal bid is chosen in an adversarial manner. Even in the simplest settings, this gives rise to algorithms whose pseudo-regret grows as $\sqrt{T}$ with respect to the time horizon $T$. % Focusing on the case where the buyer plays against a stationary stochastic environment, we show how to achieve significantly lower pseudo-regret: when the opponents’ maximal bid distribution is known we provide an algorithm whose pseudo-regret can be as low as $\log^2(T)$; in the case where the distribution must be learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+ \epsilon}$ pseudo-regret, for any $\epsilon>0$. % To obtain these results, we introduce two novel ideas that can be of interest in their own right. First, by transposing results obtained in the posted price setting, we provide conditions under which the first-price bidding utility is locally quadratic around its optimum. Second, we leverage the observation that, on small sub-intervals, the concentration of the variations of the empirical distribution function may be controlled more accurately than by using the classical Dvoretzky-Kiefer-Wolfowitz inequality. % Numerical simulations confirm that our algorithms converge much faster than alternatives proposed in the literature for various bid distributions, including for bids collected on an actual programmatic advertising platform.} }
Endnote
%0 Conference Paper %T Fast Rate Learning in Stochastic First Price Bidding %A Juliette Achddou %A Olivier Cappé %A Aurélien Garivier %B Proceedings of The 13th Asian Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Vineeth N. Balasubramanian %E Ivor Tsang %F pmlr-v157-achddou21a %I PMLR %P 1754--1769 %U https://proceedings.mlr.press/v157/achddou21a.html %V 157 %X First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. % As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. % They have already given rise to several works in sequential learning, % many of which consider models for which the value of the buyer or the opponents’ maximal bid is chosen in an adversarial manner. Even in the simplest settings, this gives rise to algorithms whose pseudo-regret grows as $\sqrt{T}$ with respect to the time horizon $T$. % Focusing on the case where the buyer plays against a stationary stochastic environment, we show how to achieve significantly lower pseudo-regret: when the opponents’ maximal bid distribution is known we provide an algorithm whose pseudo-regret can be as low as $\log^2(T)$; in the case where the distribution must be learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+ \epsilon}$ pseudo-regret, for any $\epsilon>0$. % To obtain these results, we introduce two novel ideas that can be of interest in their own right. First, by transposing results obtained in the posted price setting, we provide conditions under which the first-price bidding utility is locally quadratic around its optimum. Second, we leverage the observation that, on small sub-intervals, the concentration of the variations of the empirical distribution function may be controlled more accurately than by using the classical Dvoretzky-Kiefer-Wolfowitz inequality. % Numerical simulations confirm that our algorithms converge much faster than alternatives proposed in the literature for various bid distributions, including for bids collected on an actual programmatic advertising platform.
APA
Achddou, J., Cappé, O. & Garivier, A.. (2021). Fast Rate Learning in Stochastic First Price Bidding. Proceedings of The 13th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 157:1754-1769 Available from https://proceedings.mlr.press/v157/achddou21a.html.

Related Material