Online Convex Optimization with Perturbed Constraints: Optimal Rates against Stronger Benchmarks
[edit]
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:28852895, 2020.
Abstract
This paper studies Online Convex Optimization (OCO) problems where the constraints have additive perturbations that (i) vary over time and (ii) are not known at the time to make a decision. Perturbations may not be i.i.d. generated and can be used, for example, to model a timevarying budget or timevarying requests in resource allocation problems. Our goal is to design a policy that obtains sublinear regret and satisfies the constraints in the longterm. To this end, we present an online primaldual proximal gradient algorithm that has $O(T^\epsilon \vee T^{1\epsilon})$ regret and $O(T^\epsilon)$ constraint violation, where $\epsilon \in [0,1)$ is a parameter in the learning rate. The proposed algorithm obtains optimal rates when $\epsilon = 1/2$, and can compare against a stronger comparator (the set of fixed decisions in hindsight) than previous work.
Related Material


