Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints

Nikolaos Liakopoulos, Apostolos Destounis, Georgios Paschos, Thrasyvoulos Spyropoulos, Panayotis Mertikopoulos
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3944-3952, 2019.

Abstract

We study a class of online convex optimization problems with long-term budget constraints that arise naturally as reliability guarantees or total consumption constraints. In this general setting, prior work by Mannor et al. (2009) has shown that achieving no regret is impossible if the functions defining the agent’s budget are chosen by an adversary. To overcome this obstacle, we refine the agent’s regret metric by introducing the notion of a "K-benchmark", i.e., a comparator which meets the problem’s allotted budget over any window of length K. The impossibility analysis of Mannor et al. (2009) is recovered when K=T; however, for K=o(T), we show that it is possible to minimize regret while still meeting the problem’s long-term budget constraints. We achieve this via an online learning policy based on Cautious Online Lagrangian Descent (COLD), for which we derive explicit bounds on both the incurred regret and the residual budget violations.
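To make the comparator concrete, the K-benchmark described above can be sketched in symbols; the notation below (X for the action set, f_t for the loss functions, g_t <= 0 for the budget constraints) is assumed for illustration and is not spelled out on this page:

    \mathcal{X}_K = \Big\{ x \in \mathcal{X} \;:\; \sum_{t=\tau}^{\tau+K-1} g_t(x) \le 0 \ \text{ for all } \tau = 1, \dots, T-K+1 \Big\},
    \qquad
    \mathrm{Reg}_T = \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}_K} \sum_{t=1}^{T} f_t(x).

With K = T the comparator only needs to respect the budget in aggregate over the whole horizon, which is exactly the setting of the Mannor et al. (2009) impossibility result; shrinking the window to K = o(T) makes the benchmark more cautious and, per the abstract, renders no-regret learning compatible with the long-term budget.

The abstract does not spell out the COLD update rules, so the snippet below is only a minimal sketch of the generic online primal-dual (Lagrangian-descent) template that such policies instantiate, with hypothetical step sizes eta and mu; it is not the authors' algorithm:

import numpy as np

def project(x, radius=1.0):
    # Euclidean projection onto an l2 ball, standing in for the action set X.
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def primal_dual_step(x, lam, grad_f, grad_g, g_val, eta=0.05, mu=0.05):
    # Primal: gradient step on the Lagrangian f_t(x) + lam * g_t(x).
    x_new = project(x - eta * (grad_f + lam * grad_g))
    # Dual: raise the multiplier whenever the budget is violated (g_t(x) > 0).
    lam_new = max(0.0, lam + mu * g_val)
    return x_new, lam_new

# Toy run with linear losses f_t(x) = <c_t, x> and budgets g_t(x) = <a_t, x> - b.
rng = np.random.default_rng(0)
x, lam = np.zeros(3), 0.0
for t in range(1000):
    c_t, a_t, b = rng.normal(size=3), rng.normal(size=3), 0.1
    x, lam = primal_dual_step(x, lam, grad_f=c_t, grad_g=a_t, g_val=a_t @ x - b)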

Cite this Paper

BibTeX
@InProceedings{pmlr-v97-liakopoulos19a,
  title     = {Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints},
  author    = {Liakopoulos, Nikolaos and Destounis, Apostolos and Paschos, Georgios and Spyropoulos, Thrasyvoulos and Mertikopoulos, Panayotis},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {3944--3952},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/liakopoulos19a/liakopoulos19a.pdf},
  url       = {https://proceedings.mlr.press/v97/liakopoulos19a.html},
  abstract  = {We study a class of online convex optimization problems with long-term budget constraints that arise naturally as reliability guarantees or total consumption constraints. In this general setting, prior work by Mannor et al. (2009) has shown that achieving no regret is impossible if the functions defining the agent’s budget are chosen by an adversary. To overcome this obstacle, we refine the agent’s regret metric by introducing the notion of a "K-benchmark", i.e., a comparator which meets the problem’s allotted budget over any window of length K. The impossibility analysis of Mannor et al. (2009) is recovered when K=T; however, for K=o(T), we show that it is possible to minimize regret while still meeting the problem’s long-term budget constraints. We achieve this via an online learning policy based on Cautious Online Lagrangian Descent (COLD), for which we derive explicit bounds on both the incurred regret and the residual budget violations.}
}
Endnote
%0 Conference Paper
%T Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints
%A Nikolaos Liakopoulos
%A Apostolos Destounis
%A Georgios Paschos
%A Thrasyvoulos Spyropoulos
%A Panayotis Mertikopoulos
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-liakopoulos19a
%I PMLR
%P 3944--3952
%U https://proceedings.mlr.press/v97/liakopoulos19a.html
%V 97
%X We study a class of online convex optimization problems with long-term budget constraints that arise naturally as reliability guarantees or total consumption constraints. In this general setting, prior work by Mannor et al. (2009) has shown that achieving no regret is impossible if the functions defining the agent’s budget are chosen by an adversary. To overcome this obstacle, we refine the agent’s regret metric by introducing the notion of a "K-benchmark", i.e., a comparator which meets the problem’s allotted budget over any window of length K. The impossibility analysis of Mannor et al. (2009) is recovered when K=T; however, for K=o(T), we show that it is possible to minimize regret while still meeting the problem’s long-term budget constraints. We achieve this via an online learning policy based on Cautious Online Lagrangian Descent (COLD), for which we derive explicit bounds on both the incurred regret and the residual budget violations.
APA
Liakopoulos, N., Destounis, A., Paschos, G., Spyropoulos, T. & Mertikopoulos, P. (2019). Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3944-3952. Available from https://proceedings.mlr.press/v97/liakopoulos19a.html.
