Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit

Julien Zhou, Pierre Gaillard, Thibaud Rahier, Julyan Arbel
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:1361-1385, 2025.

Abstract

We address the online unconstrained submodular maximization problem (Online USM) in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a non-monotone submodular function taking values in a known bounded interval. This paper proposes Double-Greedy Explore-then-Commit (DG-ETC), adapting the Double-Greedy approach from the offline and online full-information settings. DG-ETC simultaneously satisfies an O(d log(dT)) problem-dependent upper bound for the 1/2-approximate pseudo-regret and an O(d T^{2/3} log(dT)^{1/3}) problem-free one, outperforming existing approaches. In particular, we introduce a problem-dependent notion of hardness characterizing the transition between the logarithmic and polynomial regimes for the upper bounds.
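For intuition about the Double-Greedy template that DG-ETC adapts, here is a minimal sketch of the classical *offline deterministic* Double-Greedy algorithm (Buchbinder et al.), which guarantees a 1/3-approximation for nonnegative submodular maximization; the randomized variant achieves the 1/2 factor referenced in the abstract. This is not the paper's bandit algorithm — it is the exact-value building block, illustrated on a toy cut function:

```python
from itertools import combinations

# Toy non-monotone submodular function: the cut function of a small graph.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]

def cut_value(subset):
    """Number of edges with exactly one endpoint in `subset`."""
    s = set(subset)
    return sum((u in s) != (v in s) for u, v in EDGES)

def deterministic_double_greedy(f, ground_set):
    """Offline deterministic Double-Greedy: maintain X ⊆ Y, sweep the
    ground set once, and for each element keep it (add to X) or drop it
    (remove from Y) depending on which marginal gain is larger."""
    x, y = set(), set(ground_set)
    for i in ground_set:
        a = f(x | {i}) - f(x)   # marginal gain of adding i to X
        b = f(y - {i}) - f(y)   # marginal gain of removing i from Y
        if a >= b:
            x.add(i)
        else:
            y.discard(i)
    return x  # x == y after the sweep

solution = deterministic_double_greedy(cut_value, list(range(4)))
```

In the bandit setting of the paper, the marginals `a` and `b` are not observed exactly and must instead be estimated from noisy reward samples, which is where the Explore-then-Commit component enters.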

Cite this Paper


BibTeX
@InProceedings{pmlr-v272-zhou25a,
  title     = {Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit},
  author    = {Zhou, Julien and Gaillard, Pierre and Rahier, Thibaud and Arbel, Julyan},
  booktitle = {Proceedings of The 36th International Conference on Algorithmic Learning Theory},
  pages     = {1361--1385},
  year      = {2025},
  editor    = {Kamath, Gautam and Loh, Po-Ling},
  volume    = {272},
  series    = {Proceedings of Machine Learning Research},
  month     = {24--27 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v272/main/assets/zhou25a/zhou25a.pdf},
  url       = {https://proceedings.mlr.press/v272/zhou25a.html},
  abstract  = {We address the online unconstrained submodular maximization problem (Online USM), in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a non monotone submodular function taking values in a known bounded interval. This paper proposes Double-Greedy - Explore-then-Commit (DG-ETC), adapting the Double-Greedy approach from the offline and online full-information settings. DG-ETC satisfies a $O(d\log(dT))$ problem-dependent upper bound for the $1/2$-approximate pseudo-regret, as well as a $O(dT^{2/3}\log(dT)^{1/3})$ problem-free one at the same time, outperforming existing approaches. In particular, we introduce a problem-dependent notion of hardness characterizing the transition between logarithmic and polynomial regime for the upper bounds.}
}
Endnote
%0 Conference Paper
%T Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit
%A Julien Zhou
%A Pierre Gaillard
%A Thibaud Rahier
%A Julyan Arbel
%B Proceedings of The 36th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Gautam Kamath
%E Po-Ling Loh
%F pmlr-v272-zhou25a
%I PMLR
%P 1361--1385
%U https://proceedings.mlr.press/v272/zhou25a.html
%V 272
%X We address the online unconstrained submodular maximization problem (Online USM), in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a non monotone submodular function taking values in a known bounded interval. This paper proposes Double-Greedy - Explore-then-Commit (DG-ETC), adapting the Double-Greedy approach from the offline and online full-information settings. DG-ETC satisfies a $O(d\log(dT))$ problem-dependent upper bound for the $1/2$-approximate pseudo-regret, as well as a $O(dT^{2/3}\log(dT)^{1/3})$ problem-free one at the same time, outperforming existing approaches. In particular, we introduce a problem-dependent notion of hardness characterizing the transition between logarithmic and polynomial regime for the upper bounds.
APA
Zhou, J., Gaillard, P., Rahier, T. & Arbel, J. (2025). Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit. Proceedings of The 36th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 272:1361-1385. Available from https://proceedings.mlr.press/v272/zhou25a.html.