Online Learning for Active Cache Synchronization

Andrey Kolobov, Sebastien Bubeck, Julian Zimmert
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:5371-5380, 2020.

Abstract

Existing multi-armed bandit (MAB) models make two implicit assumptions: an arm generates a payoff only when it is played, and the agent observes every payoff that is generated. This paper introduces synchronization bandits, a MAB variant where all arms generate costs at all times, but the agent observes an arm’s instantaneous cost only when the arm is played. Synchronization MABs are inspired by online caching scenarios such as Web crawling, where an arm corresponds to a cached item and playing the arm means downloading its fresh copy from a server. We present MirrorSync, an online learning algorithm for synchronization bandits, establish an adversarial regret of $O(T^{2/3})$ for it, and show how to make it practical.
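The interaction protocol the abstract describes can be made concrete with a small simulation. Below is a minimal sketch, not the paper's MirrorSync algorithm: the environment class of costs (linear growth in staleness via assumed change_rates), the naive round-robin sync policy, and all variable names are illustrative assumptions layered on top of the abstract's protocol, in which every arm accrues cost at every step but the learner observes an arm's instantaneous cost only when it plays (syncs) that arm.

import numpy as np

rng = np.random.default_rng(0)

K, T = 5, 1000                                # cached items (arms) and horizon
change_rates = rng.uniform(0.1, 1.0, size=K)  # per-item cost-growth rates (assumed model)

staleness = np.zeros(K)   # time since each cached item was last synced
total_cost = 0.0

for t in range(T):
    # Every arm generates cost at every step, played or not.
    # Linear cost-in-staleness is an illustrative choice, not the paper's.
    instantaneous_costs = change_rates * staleness
    total_cost += instantaneous_costs.sum()

    arm = t % K                          # naive round-robin baseline policy
    observed = instantaneous_costs[arm]  # the ONLY cost revealed to the learner;
                                         # a learning algorithm would update on it
    staleness += 1.0                     # all items grow staler...
    staleness[arm] = 0.0                 # ...except the one just re-synced

print(f"total cost under round-robin: {total_cost:.1f}")

Note that total_cost sums costs the learner mostly never sees; this gap between incurred and observed cost is what distinguishes synchronization bandits from standard MAB feedback.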

Cite this Paper

BibTeX
@InProceedings{pmlr-v119-kolobov20a,
  title     = {Online Learning for Active Cache Synchronization},
  author    = {Kolobov, Andrey and Bubeck, Sebastien and Zimmert, Julian},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {5371--5380},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/kolobov20a/kolobov20a.pdf},
  url       = {https://proceedings.mlr.press/v119/kolobov20a.html},
  abstract  = {Existing multi-armed bandit (MAB) models make two implicit assumptions: an arm generates a payoff only when it is played, and the agent observes every payoff that is generated. This paper introduces synchronization bandits, a MAB variant where all arms generate costs at all times, but the agent observes an arm’s instantaneous cost only when the arm is played. Synchronization MABs are inspired by online caching scenarios such as Web crawling, where an arm corresponds to a cached item and playing the arm means downloading its fresh copy from a server. We present MirrorSync, an online learning algorithm for synchronization bandits, establish an adversarial regret of $O(T^{2/3})$ for it, and show how to make it practical.}
}
Endnote
%0 Conference Paper
%T Online Learning for Active Cache Synchronization
%A Andrey Kolobov
%A Sebastien Bubeck
%A Julian Zimmert
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-kolobov20a
%I PMLR
%P 5371--5380
%U https://proceedings.mlr.press/v119/kolobov20a.html
%V 119
%X Existing multi-armed bandit (MAB) models make two implicit assumptions: an arm generates a payoff only when it is played, and the agent observes every payoff that is generated. This paper introduces synchronization bandits, a MAB variant where all arms generate costs at all times, but the agent observes an arm’s instantaneous cost only when the arm is played. Synchronization MABs are inspired by online caching scenarios such as Web crawling, where an arm corresponds to a cached item and playing the arm means downloading its fresh copy from a server. We present MirrorSync, an online learning algorithm for synchronization bandits, establish an adversarial regret of $O(T^{2/3})$ for it, and show how to make it practical.
APA
Kolobov, A., Bubeck, S. & Zimmert, J. (2020). Online Learning for Active Cache Synchronization. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:5371-5380. Available from https://proceedings.mlr.press/v119/kolobov20a.html.