Online Learning under Delayed Feedback

Pooria Joulani; Andras Gyorgy; Csaba Szepesvari

Online Learning under Delayed Feedback

Pooria Joulani, Andras Gyorgy, Csaba Szepesvari

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1453-1461, 2013.

Abstract

Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-joulani13,
  title = 	 {Online Learning under Delayed Feedback},
  author = 	 {Joulani, Pooria and Gyorgy, Andras and Szepesvari, Csaba},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {1453--1461},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/joulani13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/joulani13.html},
  abstract = 	 {Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.}
}

Endnote

%0 Conference Paper
%T Online Learning under Delayed Feedback
%A Pooria Joulani
%A Andras Gyorgy
%A Csaba Szepesvari
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester	
%F pmlr-v28-joulani13
%I PMLR
%P 1453--1461
%U https://proceedings.mlr.press/v28/joulani13.html
%V 28
%N 3
%X Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.

RIS


TY  - CPAPER
TI  - Online Learning under Delayed Feedback
AU  - Pooria Joulani
AU  - Andras Gyorgy
AU  - Csaba Szepesvari
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester	
ID  - pmlr-v28-joulani13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 1453
EP  - 1461
L1  - http://proceedings.mlr.press/v28/joulani13.pdf
UR  - https://proceedings.mlr.press/v28/joulani13.html
AB  - Online learning with delayed feedback has received increasing attention recently due to its several applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.
ER  -

APA


Joulani, P., Gyorgy, A. & Szepesvari, C.. (2013). Online Learning under Delayed Feedback. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1453-1461 Available from https://proceedings.mlr.press/v28/joulani13.html.

Related Material

Download PDF