[edit]
A Bayesian approach for bandit online optimization with switching cost
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1953-1963, 2023.
Abstract
As a classical problem, online optimization with switching cost has been studied for a long time due to its wide applications in various areas. However, few works have investigated the bandit setting where both the forms of the main cost function $f(x)$ evaluated at state $x$ and the switching cost function $c(x, y)$ of transitioning from state $x$ to $y$ are unknown. In this paper, we consider the situation when $\left(f(x_t)+\varepsilon_t,\,{c}(x_t, x_{t-1})\right)$ can be observed with noise $\varepsilon_t$ after making a decision $x_t$ at time $t$, aiming to minimize the expected total cost within a time horizon. To solve this problem, we propose two algorithms from a Bayesian approach, named Greedy Search and Alternating Search, respectively. They have different theoretical guarantees of competitive ratios under mild regularity conditions, and the latter algorithm achieves a faster running speed. Using simulations of two classical black-box optimization problems, we demonstrate the superior performance of our algorithms compared with the classical method.