An Optimal Algorithm for Bandit Convex Optimization with Strongly-Convex and Smooth Loss
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2229-2239, 2020.
Abstract
We consider non-stochastic bandit convex optimization with strongly-convex and smooth loss functions. For this problem, Hazan and Levy have proposed an algorithm with a regret bound of Õ(d^{3/2}√T), given access to an O(d)-self-concordant barrier over the feasible region, where d and T stand for the dimensionality of the feasible region and the number of rounds, respectively. However, there is no known efficient way to construct self-concordant barriers for general convex sets, and an Õ(√d) gap has remained between the upper and lower bounds, as the known regret lower bound is Ω(d√T). Our study resolves these two issues by introducing an algorithm that achieves an optimal regret bound of Õ(d√T) under a mild assumption, without self-concordant barriers. More precisely, the algorithm requires only a membership oracle for the feasible region, and it achieves the optimal regret bound of Õ(d√T) under the assumption that the optimal solution lies in the interior of the feasible region. Even without this assumption, our algorithm achieves Õ(d^{3/2}√T) regret.