Online Learning in the Random-Order Model

Martino Bernasconi, Andrea Celli, Riccardo Colini Baldeschi, Federico Fusco, Stefano Leonardi, Matteo Russo
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:3899-3919, 2025.

Abstract

In the random-order model for online learning, the sequence of losses is chosen upfront by an adversary and presented to the learner after a random permutation. Any random-order input is asymptotically equivalent to a stochastic i.i.d. one, but, for finite times, it may exhibit significant non-stationarity, which can hinder the performance of stochastic learning algorithms. While algorithms for adversarial inputs naturally maintain their regret guarantees in random order, simple no-regret algorithms exist for the stochastic model that fail against random-order instances. In this paper, we propose a general procedure to adapt stochastic learning algorithms to the random-order model without substantially affecting their regret guarantees. This allows us to recover improved regret bounds for prediction with delays, bandits with switching costs, and online learning with constraints. Finally, we investigate online classification and prove that, in random order, learnability is characterized by the VC dimension rather than by the Littlestone dimension, thus providing a further separation from the general adversarial model.
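The random-order model described above can be illustrated with a minimal sketch (not from the paper): the adversary fixes the entire loss sequence upfront, and the learner then observes it under a uniformly random permutation. The function name and the example sequence are illustrative choices.

```python
import random

def random_order_stream(losses, seed=None):
    """Yield a fixed (adversarially chosen) loss sequence in uniformly
    random order, as in the random-order model: the sequence is set
    before learning starts, then a random permutation is applied."""
    rng = random.Random(seed)
    order = list(range(len(losses)))
    rng.shuffle(order)  # uniform random permutation of the time indices
    for t in order:
        yield losses[t]

# An adversarial, highly non-stationary sequence: a block of high losses
# followed by a block of low losses. After shuffling, the multiset of
# losses is unchanged, but short time windows may still look non-stationary.
losses = [1.0] * 5 + [0.0] * 5
observed = list(random_order_stream(losses, seed=0))
```

Note that the permutation leaves the empirical distribution of losses unchanged, which is why random-order inputs are asymptotically equivalent to i.i.d. ones, while finite prefixes may still be unbalanced.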

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-bernasconi25b,
  title     = {Online Learning in the Random-Order Model},
  author    = {Bernasconi, Martino and Celli, Andrea and Colini Baldeschi, Riccardo and Fusco, Federico and Leonardi, Stefano and Russo, Matteo},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {3899--3919},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/bernasconi25b/bernasconi25b.pdf},
  url       = {https://proceedings.mlr.press/v267/bernasconi25b.html},
  abstract  = {In the random-order model for online learning, the sequence of losses is chosen upfront by an adversary and presented to the learner after a random permutation. Any random-order input is asymptotically equivalent to a stochastic i.i.d. one, but, for finite times, it may exhibit significant non-stationarity, which can hinder the performance of stochastic learning algorithms. While algorithms for adversarial inputs naturally maintain their regret guarantees in random order, simple no-regret algorithms exist for the stochastic model that fail against random-order instances. In this paper, we propose a general procedure to adapt stochastic learning algorithms to the random-order model without substantially affecting their regret guarantees. This allows us to recover improved regret bounds for prediction with delays, bandits with switching costs, and online learning with constraints. Finally, we investigate online classification and prove that, in random order, learnability is characterized by the VC dimension rather than by the Littlestone dimension, thus providing a further separation from the general adversarial model.}
}
Endnote
%0 Conference Paper
%T Online Learning in the Random-Order Model
%A Martino Bernasconi
%A Andrea Celli
%A Riccardo Colini Baldeschi
%A Federico Fusco
%A Stefano Leonardi
%A Matteo Russo
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-bernasconi25b
%I PMLR
%P 3899--3919
%U https://proceedings.mlr.press/v267/bernasconi25b.html
%V 267
%X In the random-order model for online learning, the sequence of losses is chosen upfront by an adversary and presented to the learner after a random permutation. Any random-order input is asymptotically equivalent to a stochastic i.i.d. one, but, for finite times, it may exhibit significant non-stationarity, which can hinder the performance of stochastic learning algorithms. While algorithms for adversarial inputs naturally maintain their regret guarantees in random order, simple no-regret algorithms exist for the stochastic model that fail against random-order instances. In this paper, we propose a general procedure to adapt stochastic learning algorithms to the random-order model without substantially affecting their regret guarantees. This allows us to recover improved regret bounds for prediction with delays, bandits with switching costs, and online learning with constraints. Finally, we investigate online classification and prove that, in random order, learnability is characterized by the VC dimension rather than by the Littlestone dimension, thus providing a further separation from the general adversarial model.
APA
Bernasconi, M., Celli, A., Colini Baldeschi, R., Fusco, F., Leonardi, S. & Russo, M. (2025). Online Learning in the Random-Order Model. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:3899-3919. Available from https://proceedings.mlr.press/v267/bernasconi25b.html.