A Bayesian approach for bandit online optimization with switching cost
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1953-1963, 2023.
Abstract
As a classical problem, online optimization with switching cost has been studied for a long time due to its wide applications in various areas. However, few works have investigated the bandit setting, where the forms of both the main cost function f(x) evaluated at state x and the switching cost function c(x, y) of transitioning from state x to y are unknown. In this paper, we consider the setting in which, after making a decision x_t at time t, the pair (f(x_t) + ε_t, c(x_t, x_{t−1})) is observed, where ε_t is observation noise, and the goal is to minimize the expected total cost over a finite time horizon. To solve this problem, we propose two algorithms based on a Bayesian approach, named Greedy Search and Alternating Search. They come with different theoretical guarantees on the competitive ratio under mild regularity conditions, and the latter algorithm runs faster. Simulations on two classical black-box optimization problems demonstrate that our algorithms outperform the classical method.
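To make the problem setting concrete, below is a minimal sketch of a greedy Bayesian decision loop of the kind the abstract describes: a Gaussian-process surrogate models the unknown main cost f, and each round the next state is chosen by trading off the posterior estimate of f against the cost of switching away from the previous state. This is an illustrative assumption rather than the paper's actual Greedy Search or Alternating Search algorithms; in particular, the candidate grid, the test function f_true, and the known quadratic switching_cost are placeholders (in the paper the switching cost is also unknown).

```python
# Hypothetical sketch: greedy Bayesian loop for bandit online
# optimization with switching cost.  A GP surrogate models the unknown
# main cost f; the switching cost is taken as a known quadratic penalty
# here purely for illustration (in the paper it is also unknown).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def f_true(x):                      # unknown main cost (black box), placeholder
    return np.sin(3 * x) + 0.5 * x**2

def switching_cost(x, y, lam=0.5):  # assumed known here for illustration
    return lam * (x - y) ** 2

candidates = np.linspace(-2.0, 2.0, 201)   # finite decision set (assumption)
noise_std = 0.1
T = 30

gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=0.5) + WhiteKernel(noise_level=noise_std**2),
    normalize_y=True,
)

X_obs, y_obs = [], []
x_prev = 0.0                                # initial state
total_cost = 0.0

for t in range(T):
    if X_obs:
        # Update the Bayesian surrogate with all noisy observations so far.
        gp.fit(np.array(X_obs).reshape(-1, 1), np.array(y_obs))
        mu = gp.predict(candidates.reshape(-1, 1))
    else:
        mu = np.zeros_like(candidates)      # uninformative prior mean

    # Greedy step: minimize posterior mean of f plus the cost of moving
    # away from the previous state x_prev.
    scores = mu + switching_cost(candidates, x_prev)
    x_t = candidates[np.argmin(scores)]

    # Bandit feedback: noisy evaluation of f plus the realized switching cost.
    y_t = f_true(x_t) + rng.normal(scale=noise_std)
    total_cost += y_t + switching_cost(x_t, x_prev)

    X_obs.append(x_t)
    y_obs.append(y_t)
    x_prev = x_t

print(f"total observed cost over {T} rounds: {total_cost:.3f}")
```

The sketch highlights the tension the abstract refers to: moving aggressively toward the surrogate's current minimizer reduces the main cost but incurs switching cost, so the decision rule must balance the two at every round.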