A Bayesian approach for bandit online optimization with switching cost

Zai Shi, Jian Tan, Feifei Li
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1953-1963, 2023.

Abstract

As a classical problem, online optimization with switching cost has been studied for a long time due to its wide applications in various areas. However, few works have investigated the bandit setting where both the forms of the main cost function $f(x)$ evaluated at state $x$ and the switching cost function $c(x, y)$ of transitioning from state $x$ to $y$ are unknown. In this paper, we consider the situation when $\left(f(x_t)+\varepsilon_t,\, c(x_t, x_{t-1})\right)$ can be observed with noise $\varepsilon_t$ after making a decision $x_t$ at time $t$, aiming to minimize the expected total cost within a time horizon. To solve this problem, we propose two algorithms from a Bayesian approach, named Greedy Search and Alternating Search, respectively. They have different theoretical guarantees of competitive ratios under mild regularity conditions, and the latter algorithm achieves a faster running speed. Using simulations of two classical black-box optimization problems, we demonstrate the superior performance of our algorithms compared with the classical method.
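Concretely, writing $T$ for the time horizon (notation assumed here, not taken from the abstract), the objective implied by the abstract is to minimize the expected cumulative cost

$$\min_{x_1,\dots,x_T} \; \mathbb{E}\!\left[\sum_{t=1}^{T} \big(f(x_t) + c(x_t, x_{t-1})\big)\right],$$

where only the noisy bandit feedback $\left(f(x_t)+\varepsilon_t,\, c(x_t, x_{t-1})\right)$ is revealed after the decision $x_t$ is made.

For intuition only, the following is a minimal sketch of a generic Gaussian-process-surrogate greedy loop in this setting. It is not the paper's Greedy Search or Alternating Search algorithm; the candidate grid, kernels, synthetic f and c, initial state, and the posterior-mean decision rule are all illustrative assumptions.

# Generic sketch of a Bayesian (GP-surrogate) greedy loop for bandit online
# optimization with switching cost. Illustrates the feedback model from the
# abstract only; NOT the paper's Greedy Search / Alternating Search.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
T = 30                                        # time horizon (assumption)
cands = np.linspace(-2.0, 2.0, 41)[:, None]   # finite candidate set (assumption)

def f_true(x):        # unknown to the learner; used only to synthesize feedback
    return np.sin(3 * x) + 0.1 * x ** 2

def c_true(x, y):     # unknown switching cost, e.g. a movement cost
    return 0.5 * abs(x - y)

gp_f = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-2)   # surrogate for f
gp_c = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4)   # surrogate for c

X_f, y_f = [], []     # observed (x_t, f(x_t) + eps_t)
X_c, y_c = [], []     # observed ((x_t, x_{t-1}), c(x_t, x_{t-1}))
x_prev = 0.0          # initial state (assumption)
total_cost = 0.0

for t in range(T):
    if t < 3:         # a few random decisions to seed the surrogates
        x_t = float(rng.choice(cands.ravel()))
    else:
        mu_f, _ = gp_f.predict(cands, return_std=True)
        pairs = np.hstack([cands, np.full_like(cands, x_prev)])
        mu_c, _ = gp_c.predict(pairs, return_std=True)
        x_t = float(cands[np.argmin(mu_f + mu_c)])    # greedy posterior-mean rule

    # Bandit feedback: noisy main cost and the switching cost actually incurred.
    obs_f = f_true(x_t) + rng.normal(scale=0.1)
    obs_c = c_true(x_t, x_prev)
    total_cost += obs_f + obs_c

    X_f.append([x_t]); y_f.append(obs_f)
    X_c.append([x_t, x_prev]); y_c.append(obs_c)
    gp_f.fit(np.array(X_f), np.array(y_f))
    gp_c.fit(np.array(X_c), np.array(y_c))
    x_prev = x_t

print("cumulative cost over horizon:", round(total_cost, 3))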

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-shi23b,
  title     = {A {B}ayesian approach for bandit online optimization with switching cost},
  author    = {Shi, Zai and Tan, Jian and Li, Feifei},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1953--1963},
  year      = {2023},
  editor    = {Evans, Robin J. and Shpitser, Ilya},
  volume    = {216},
  series    = {Proceedings of Machine Learning Research},
  month     = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v216/shi23b/shi23b.pdf},
  url       = {https://proceedings.mlr.press/v216/shi23b.html},
  abstract  = {As a classical problem, online optimization with switching cost has been studied for a long time due to its wide applications in various areas. However, few works have investigated the bandit setting where both the forms of the main cost function $f(x)$ evaluated at state $x$ and the switching cost function $c(x, y)$ of transitioning from state $x$ to $y$ are unknown. In this paper, we consider the situation when $\left(f(x_t)+\varepsilon_t,\,{c}(x_t, x_{t-1})\right)$ can be observed with noise $\varepsilon_t$ after making a decision $x_t$ at time $t$, aiming to minimize the expected total cost within a time horizon. To solve this problem, we propose two algorithms from a Bayesian approach, named Greedy Search and Alternating Search, respectively. They have different theoretical guarantees of competitive ratios under mild regularity conditions, and the latter algorithm achieves a faster running speed. Using simulations of two classical black-box optimization problems, we demonstrate the superior performance of our algorithms compared with the classical method.}
}
Endnote
%0 Conference Paper
%T A Bayesian approach for bandit online optimization with switching cost
%A Zai Shi
%A Jian Tan
%A Feifei Li
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-shi23b
%I PMLR
%P 1953--1963
%U https://proceedings.mlr.press/v216/shi23b.html
%V 216
%X As a classical problem, online optimization with switching cost has been studied for a long time due to its wide applications in various areas. However, few works have investigated the bandit setting where both the forms of the main cost function $f(x)$ evaluated at state $x$ and the switching cost function $c(x, y)$ of transitioning from state $x$ to $y$ are unknown. In this paper, we consider the situation when $\left(f(x_t)+\varepsilon_t,\,{c}(x_t, x_{t-1})\right)$ can be observed with noise $\varepsilon_t$ after making a decision $x_t$ at time $t$, aiming to minimize the expected total cost within a time horizon. To solve this problem, we propose two algorithms from a Bayesian approach, named Greedy Search and Alternating Search, respectively. They have different theoretical guarantees of competitive ratios under mild regularity conditions, and the latter algorithm achieves a faster running speed. Using simulations of two classical black-box optimization problems, we demonstrate the superior performance of our algorithms compared with the classical method.
APA
Shi, Z., Tan, J. & Li, F. (2023). A Bayesian approach for bandit online optimization with switching cost. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1953-1963. Available from https://proceedings.mlr.press/v216/shi23b.html.
