Universal Dynamic Regret and Constraint Violation Bounds for Constrained Online Convex Optimization

Subhamon Supantha, Abhishek Sinha
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-29, 2026.

Abstract

We consider a generalization of the celebrated Online Convex Optimization (OCO) framework with adversarial online constraints. In this problem, an online learner interacts with an adversary sequentially over multiple rounds. At the beginning of each round, the learner chooses an action from a convex decision set. After that, the adversary reveals a convex cost function and a convex constraint function. The goal of the learner is to minimize the cumulative cost while satisfying the constraints as tightly as possible. We present two efficient algorithms with simple modular structures that give universal dynamic regret and cumulative constraint violation bounds, improving upon state-of-the-art results. While the first algorithm, which achieves the optimal regret bound, involves projection onto the constraint sets, the second algorithm is projection-free and achieves better violation bounds in rapidly varying environments. Our results hold in the most general case when both the cost and constraint functions are chosen arbitrarily, and the constraint functions need not contain any common feasible point. We establish these results by introducing a general framework that reduces the constrained learning problem to an instance of the standard OCO problem with specially constructed surrogate cost functions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-supantha26a, title = {Universal Dynamic Regret and Constraint Violation Bounds for Constrained Online Convex Optimization}, author = {Supantha, Subhamon and Sinha, Abhishek}, booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory}, pages = {1--29}, year = {2026}, editor = {Telgarsky, Matus and Ullman, Jonathan}, volume = {313}, series = {Proceedings of Machine Learning Research}, month = {23--26 Feb}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/supantha26a/supantha26a.pdf}, url = {https://proceedings.mlr.press/v313/supantha26a.html}, abstract = {We consider a generalization of the celebrated Online Convex Optimization (OCO) framework with adversarial online constraints. In this problem, an online learner interacts with an adversary sequentially over multiple rounds. At the beginning of each round, the learner chooses an action from a convex decision set. After that, the adversary reveals a convex cost function and a convex constraint function. The goal of the learner is to minimize the cumulative cost while satisfying the constraints as tightly as possible. We present two efficient algorithms with simple modular structures that give universal dynamic regret and cumulative constraint violation bounds, improving upon state-of-the-art results. While the first algorithm, which achieves the optimal regret bound, involves projection onto the constraint sets, the second algorithm is projection-free and achieves better violation bounds in rapidly varying environments. Our results hold in the most general case when both the cost and constraint functions are chosen arbitrarily, and the constraint functions need not contain any common feasible point. We establish these results by introducing a general framework that reduces the constrained learning problem to an instance of the standard OCO problem with specially constructed surrogate cost functions.} }
Endnote
%0 Conference Paper %T Universal Dynamic Regret and Constraint Violation Bounds for Constrained Online Convex Optimization %A Subhamon Supantha %A Abhishek Sinha %B Proceedings of The 37th International Conference on Algorithmic Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Matus Telgarsky %E Jonathan Ullman %F pmlr-v313-supantha26a %I PMLR %P 1--29 %U https://proceedings.mlr.press/v313/supantha26a.html %V 313 %X We consider a generalization of the celebrated Online Convex Optimization (OCO) framework with adversarial online constraints. In this problem, an online learner interacts with an adversary sequentially over multiple rounds. At the beginning of each round, the learner chooses an action from a convex decision set. After that, the adversary reveals a convex cost function and a convex constraint function. The goal of the learner is to minimize the cumulative cost while satisfying the constraints as tightly as possible. We present two efficient algorithms with simple modular structures that give universal dynamic regret and cumulative constraint violation bounds, improving upon state-of-the-art results. While the first algorithm, which achieves the optimal regret bound, involves projection onto the constraint sets, the second algorithm is projection-free and achieves better violation bounds in rapidly varying environments. Our results hold in the most general case when both the cost and constraint functions are chosen arbitrarily, and the constraint functions need not contain any common feasible point. We establish these results by introducing a general framework that reduces the constrained learning problem to an instance of the standard OCO problem with specially constructed surrogate cost functions.
APA
Supantha, S. & Sinha, A.. (2026). Universal Dynamic Regret and Constraint Violation Bounds for Constrained Online Convex Optimization. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-29 Available from https://proceedings.mlr.press/v313/supantha26a.html.

Related Material