Online Learning under Delayed Feedback

Pooria Joulani, Andras Gyorgy, Csaba Szepesvari
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1453-1461, 2013.

Abstract

Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-joulani13, title = {Online Learning under Delayed Feedback}, author = {Joulani, Pooria and Gyorgy, Andras and Szepesvari, Csaba}, booktitle = {Proceedings of the 30th International Conference on Machine Learning}, pages = {1453--1461}, year = {2013}, editor = {Dasgupta, Sanjoy and McAllester, David}, volume = {28}, number = {3}, series = {Proceedings of Machine Learning Research}, address = {Atlanta, Georgia, USA}, month = {17--19 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v28/joulani13.pdf}, url = {https://proceedings.mlr.press/v28/joulani13.html}, abstract = {Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.} }
Endnote
%0 Conference Paper %T Online Learning under Delayed Feedback %A Pooria Joulani %A Andras Gyorgy %A Csaba Szepesvari %B Proceedings of the 30th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2013 %E Sanjoy Dasgupta %E David McAllester %F pmlr-v28-joulani13 %I PMLR %P 1453--1461 %U https://proceedings.mlr.press/v28/joulani13.html %V 28 %N 3 %X Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.
RIS
TY - CPAPER TI - Online Learning under Delayed Feedback AU - Pooria Joulani AU - Andras Gyorgy AU - Csaba Szepesvari BT - Proceedings of the 30th International Conference on Machine Learning DA - 2013/05/26 ED - Sanjoy Dasgupta ED - David McAllester ID - pmlr-v28-joulani13 PB - PMLR DP - Proceedings of Machine Learning Research VL - 28 IS - 3 SP - 1453 EP - 1461 L1 - http://proceedings.mlr.press/v28/joulani13.pdf UR - https://proceedings.mlr.press/v28/joulani13.html AB - Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity. ER -
APA
Joulani, P., Gyorgy, A. & Szepesvari, C.. (2013). Online Learning under Delayed Feedback. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1453-1461 Available from https://proceedings.mlr.press/v28/joulani13.html.

Related Material