A Bayesian Approach for Stochastic Continuum-armed Bandit with Long-term Constraints

Zai Shi, Atilla Eryilmaz
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:8370-8391, 2022.

Abstract

Despite many valuable advances in online convex optimization over the last decade, many machine learning and networking problems of interest do not fit into that framework due to their nonconvex objectives and the presence of constraints. This motivates us to go beyond convexity in this paper and study the problem of the stochastic continuum-armed bandit with long-term constraints. For noiseless observations of the constraint functions, we propose a generic method using a Bayesian approach based on a class of penalty functions, and prove that it achieves sublinear regret with respect to the global optimum and sublinear constraint violation (CV), matching the best results of previous methods. Additionally, we propose another method for the case where the constraint functions are observed with noise, which achieves sublinear regret and sublinear CV under additional assumptions. Finally, we use two experiments to compare our methods with two benchmark methods from online optimization and Bayesian optimization, which demonstrate the advantages of our algorithms.
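
To make the setting concrete, the sketch below shows one way a penalized Bayesian bandit loop of this flavor can be organized: fit Gaussian-process surrogates to the observed rewards and constraint values, then pick the next arm by maximizing an upper-confidence acquisition minus a penalty on the estimated constraint violation. This is only an illustrative sketch under assumed choices (RBF kernels, a UCB acquisition, penalty weight rho, a discretized set of candidate arms); it is not the algorithm analyzed in the paper.

# Illustrative sketch only -- not the paper's method. Assumes scikit-learn.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def f(x):          # unknown (possibly nonconvex) reward, observed with noise
    return np.sin(3 * x) + 0.5 * np.cos(5 * x)

def g(x):          # unknown constraint; feasibility means g(x) <= 0
    return x - 0.6

T = 30
candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)   # discretized continuum of arms
X, y_f, y_g = [], [], []

# warm start with a few random arms
for x0 in rng.uniform(0.0, 1.0, size=3):
    X.append([x0])
    y_f.append(f(x0) + 0.1 * rng.standard_normal())
    y_g.append(g(x0))

for t in range(T):
    # GP surrogates for the reward and the constraint
    gp_f = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-2).fit(X, y_f)
    gp_g = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-6).fit(X, y_g)

    mu_f, sd_f = gp_f.predict(candidates, return_std=True)
    mu_g = gp_g.predict(candidates)

    beta = 2.0   # exploration weight (illustrative)
    rho = 5.0    # penalty weight on estimated violation (illustrative)
    acq = mu_f + beta * sd_f - rho * np.maximum(mu_g, 0.0)   # penalized UCB acquisition

    x_t = candidates[np.argmax(acq)]
    X.append(list(x_t))
    y_f.append(f(x_t[0]) + 0.1 * rng.standard_normal())   # noisy reward observation
    y_g.append(g(x_t[0]))                                  # noiseless constraint observation

print("last chosen arm:", X[-1], "constraint value:", y_g[-1])

Under this kind of formalization, regret compares the cumulative reward of the chosen arms to that of the best feasible arm, and the long-term constraint violation accumulates the positive parts of the constraint values over the chosen arms; the paper's exact definitions may differ.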

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-shi22c,
  title     = {A Bayesian Approach for Stochastic Continuum-armed Bandit with Long-term Constraints},
  author    = {Shi, Zai and Eryilmaz, Atilla},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {8370--8391},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/shi22c/shi22c.pdf},
  url       = {https://proceedings.mlr.press/v151/shi22c.html},
  abstract  = {Despite many valuable advances in the domain of online convex optimization over the last decade, many machine learning and networking problems of interest do not fit into that framework due to their nonconvex objectives and the presence of constraints. This motivates us in this paper to go beyond convexity and study the problem of stochastic continuum-armed bandit with long-term constraints. For noiseless observations of constraint functions, we propose a generic method using a Bayesian approach based on a class of penalty functions, and prove that it can achieve a sublinear regret with respect to the global optimum and a sublinear constraint violation (CV), which can match the best results of previous methods. Additionally, we propose another method to deal with the case where constraint functions are observed with noise, which can achieve a sublinear regret and a sublinear CV with more assumptions. Finally, we use two experiments to compare our methods with two benchmark methods in online optimization and Bayesian optimization, which demonstrates the advantages of our algorithms.}
}
Endnote
%0 Conference Paper
%T A Bayesian Approach for Stochastic Continuum-armed Bandit with Long-term Constraints
%A Zai Shi
%A Atilla Eryilmaz
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-shi22c
%I PMLR
%P 8370--8391
%U https://proceedings.mlr.press/v151/shi22c.html
%V 151
%X Despite many valuable advances in the domain of online convex optimization over the last decade, many machine learning and networking problems of interest do not fit into that framework due to their nonconvex objectives and the presence of constraints. This motivates us in this paper to go beyond convexity and study the problem of stochastic continuum-armed bandit with long-term constraints. For noiseless observations of constraint functions, we propose a generic method using a Bayesian approach based on a class of penalty functions, and prove that it can achieve a sublinear regret with respect to the global optimum and a sublinear constraint violation (CV), which can match the best results of previous methods. Additionally, we propose another method to deal with the case where constraint functions are observed with noise, which can achieve a sublinear regret and a sublinear CV with more assumptions. Finally, we use two experiments to compare our methods with two benchmark methods in online optimization and Bayesian optimization, which demonstrates the advantages of our algorithms.
APA
Shi, Z. & Eryilmaz, A. (2022). A Bayesian Approach for Stochastic Continuum-armed Bandit with Long-term Constraints. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:8370-8391. Available from https://proceedings.mlr.press/v151/shi22c.html.
