[edit]
Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:3642-3643, 2024.
Abstract
We study a game between autobidding algorithms that compete in an online advertising platform. Each autobidder is tasked with maximizing its advertiser’s total value over multiple rounds of a repeated auction, subject to budget and return-on-investment constraints. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret. Our algorithm uses only bandit feedback and can be used with the first- or second-price auction, as well as with any “intermediate” auction format. Our main result is that when these autobidders play against each other, the resulting expected liquid welfare over all rounds is at least half of the expected optimal liquid welfare achieved by any allocation. Our analysis holds whether or not the bidding dynamics converges to an equilibrium, side-stepping the dearth of provable convergence guarantees in the literature and the hardness result (Chen et al., 2021) which precludes such guarantees for budget-constrained second-price auctions. Our vanishing-regret result extends to an adversarial environment, without any assumptions on the other agents. We adopt a non-standard benchmark: the sequence of bids such that each bid $b_t$ maximizes value for the environment in round $t$. Hence, we side-step the impossibility results for the standard benchmark of best fixed bid (Balseiro and Gur, 2019). When there is only a budget constraint, our algorithm specializes to the autobidding algorithm from (Balseiro and Gur, 2019), and our guarantees specialize to the regret and liquid welfare guarantees from Gaitonde et al. (2023). While our approach to bounding liquid welfare shares a common high-level strategy with Gaitonde et al. (2023), handling the ROI constraint, and particularly both constraints jointly, introduces a variety of new technical challenges. These challenges necessitate a new algorithm, changes to the way liquid welfare bounds are established, and a different methodology for establishing regret properties.