The SMART approach to instance-optimal online learning

Siddhartha Banerjee, Alankrita Bhatt, Christina Lee Yu
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:426-426, 2024.

Abstract

We devise an online learning algorithm – titled Switching via Monotone Adapted Regret Traces (SMART) – that adapts to the data and achieves regret that is instance optimal, i.e., simultaneously competitive on every input sequence compared to the performance of the follow-the-leader (FTL) policy and the worst-case guarantee of any other input policy. We show that the regret of the SMART policy on any input sequence is within a multiplicative factor e/(e-1), approximately 1.58, of the smaller of: 1) the regret obtained by FTL on the sequence, and 2) the upper bound on regret guaranteed by the given worst-case policy. This implies a strictly stronger guarantee than typical ‘best-of-both-worlds’ bounds, as the guarantee holds for every input sequence regardless of how it is generated. SMART is simple to implement, as it begins by playing FTL and switches at most once during the time horizon to the worst-case algorithm. Our approach and results follow from a reduction of instance-optimal online learning to competitive analysis for the ski-rental problem. We complement our competitive ratio upper bounds with a fundamental lower bound showing that over all input sequences, no algorithm can get better than a 1.43-fraction of the minimum regret achieved by FTL and the minimax-optimal policy. We present a modification of SMART that combines FTL with a “small-loss” algorithm to achieve instance optimality between the regret of FTL and the small-loss regret bound.
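
To make the switching rule concrete, the following is a minimal Python sketch of a SMART-style policy in the full-information experts setting. It is an illustration of the ski-rental reduction described in the abstract, not the paper's exact specification: the act()/update() interface of the fallback worst_case_alg, the way the regret trace is computed, and the particular randomized threshold are all assumptions. A deterministic switch at the worst-case bound B would give a factor-2 guarantee; drawing the threshold from the randomized ski-rental distribution is what yields the e/(e-1) factor, so that on every sequence the regret is at most (e/(e-1)) times the smaller of FTL's regret and B.

    import math
    import random

    def smart_regret(loss_seq, worst_case_alg, worst_case_bound):
        """Hypothetical sketch of a SMART-style policy (experts / full-information setting).

        loss_seq: list of loss vectors, loss_seq[t][i] = loss of action i in round t.
        worst_case_alg: an assumed no-regret algorithm with act()/update(losses) methods
                        and a known regret guarantee B = worst_case_bound (e.g. Hedge).

        Plays follow-the-leader (FTL) while tracking its regret trace, and switches,
        at most once, to worst_case_alg when the trace crosses a threshold drawn from
        the randomized ski-rental distribution on [0, B].
        """
        n = len(loss_seq[0])
        B = float(worst_case_bound)
        # Inverse-CDF sample from the ski-rental density proportional to e^{x/B} on [0, B].
        theta = B * math.log(1.0 + random.random() * (math.e - 1.0))

        cum = [0.0] * n        # cumulative loss of each fixed action
        alg_loss = 0.0         # cumulative loss incurred by the policy
        switched = False

        for losses in loss_seq:
            if switched:
                action = worst_case_alg.act()
            else:
                action = min(range(n), key=cum.__getitem__)   # FTL: play the current leader
            alg_loss += losses[action]
            worst_case_alg.update(losses)                     # keep the fallback algorithm in sync
            for i in range(n):
                cum[i] += losses[i]
            if not switched and (alg_loss - min(cum)) >= theta:
                switched = True                               # one-time switch to the worst-case algorithm
        return alg_loss - min(cum)                            # regret on this sequence

Because the policy never switches back, it behaves exactly like FTL on sequences where FTL's regret trace stays below the threshold, and otherwise pays at most the threshold plus the fallback algorithm's worst-case regret for the remainder of the horizon.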

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-banerjee24a,
  title     = {The SMART approach to instance-optimal online learning},
  author    = {Banerjee, Siddhartha and Bhatt, Alankrita and Yu, Christina Lee},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {426--426},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/banerjee24a/banerjee24a.pdf},
  url       = {https://proceedings.mlr.press/v247/banerjee24a.html}
}
APA
Banerjee, S., Bhatt, A. & Yu, C. L. (2024). The SMART approach to instance-optimal online learning. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:426-426. Available from https://proceedings.mlr.press/v247/banerjee24a.html.

Related Material

Download PDF: https://proceedings.mlr.press/v247/banerjee24a/banerjee24a.pdf