Online Learning and Blackwell Approachability in Quitting Games

Janos Flesch; Rida Laraki; Vianney Perchet

Online Learning and Blackwell Approachability in Quitting Games

Janos Flesch, Rida Laraki, Vianney Perchet

29th Annual Conference on Learning Theory, PMLR 49:941-942, 2016.

Abstract

We consider the sequential decision problem known as regret minimization, or more precisely its generalization to the vectorial or multi-criteria setup called Blackwell approachability. We assume that Nature, the decision maker, or both, might have some quitting (or terminating) actions so that the stream of payoffs is constant whenever they are chosen. We call those environments “quitting games”. We characterize convex target sets \cC that are Blackwell approachable, in the sense that the decision maker has a policy ensuring that the expected average vector payoff converges to \cC at some given horizon known in advance. Moreover, we also compare these results to the cases where the horizon is not known and show that, unlike in standard online learning literature, the necessary or sufficient conditions for the anytime version of this problem are drastically different than those for the fixed horizon.

Cite this Paper

BibTeX


@InProceedings{pmlr-v49-flesch16,
  title = 	 {Online Learning and Blackwell Approachability in Quitting Games},
  author = 	 {Flesch, Janos and Laraki, Rida and Perchet, Vianney},
  booktitle = 	 {29th Annual Conference on Learning Theory},
  pages = 	 {941--942},
  year = 	 {2016},
  editor = 	 {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad},
  volume = 	 {49},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Columbia University, New York, New York, USA},
  month = 	 {23--26 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v49/flesch16.pdf},
  url = 	 {https://proceedings.mlr.press/v49/flesch16.html},
  abstract = 	 {We consider  the sequential decision problem known as regret minimization, or more precisely its generalization to the  vectorial or multi-criteria  setup called Blackwell approachability. We  assume that   Nature, the decision maker, or both, might have some quitting (or terminating) actions so that the stream of payoffs is constant whenever they are chosen. We call those environments  “quitting games”. We characterize convex target sets \cC that are Blackwell approachable, in the sense that the decision maker has a policy ensuring that the expected average vector payoff converges to \cC at some given horizon known in advance. Moreover, we also compare these results to the cases where  the horizon is not known and show that, unlike in standard online learning literature, the necessary or sufficient conditions for the anytime version of this problem are drastically different than those for the fixed horizon. }
}

Endnote

%0 Conference Paper
%T Online Learning and Blackwell Approachability in Quitting Games
%A Janos Flesch
%A Rida Laraki
%A Vianney Perchet
%B 29th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2016
%E Vitaly Feldman
%E Alexander Rakhlin
%E Ohad Shamir	
%F pmlr-v49-flesch16
%I PMLR
%P 941--942
%U https://proceedings.mlr.press/v49/flesch16.html
%V 49
%X We consider  the sequential decision problem known as regret minimization, or more precisely its generalization to the  vectorial or multi-criteria  setup called Blackwell approachability. We  assume that   Nature, the decision maker, or both, might have some quitting (or terminating) actions so that the stream of payoffs is constant whenever they are chosen. We call those environments  “quitting games”. We characterize convex target sets \cC that are Blackwell approachable, in the sense that the decision maker has a policy ensuring that the expected average vector payoff converges to \cC at some given horizon known in advance. Moreover, we also compare these results to the cases where  the horizon is not known and show that, unlike in standard online learning literature, the necessary or sufficient conditions for the anytime version of this problem are drastically different than those for the fixed horizon.

RIS


TY  - CPAPER
TI  - Online Learning and Blackwell Approachability in Quitting Games
AU  - Janos Flesch
AU  - Rida Laraki
AU  - Vianney Perchet
BT  - 29th Annual Conference on Learning Theory
DA  - 2016/06/06
ED  - Vitaly Feldman
ED  - Alexander Rakhlin
ED  - Ohad Shamir	
ID  - pmlr-v49-flesch16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 49
SP  - 941
EP  - 942
L1  - http://proceedings.mlr.press/v49/flesch16.pdf
UR  - https://proceedings.mlr.press/v49/flesch16.html
AB  - We consider  the sequential decision problem known as regret minimization, or more precisely its generalization to the  vectorial or multi-criteria  setup called Blackwell approachability. We  assume that   Nature, the decision maker, or both, might have some quitting (or terminating) actions so that the stream of payoffs is constant whenever they are chosen. We call those environments  “quitting games”. We characterize convex target sets \cC that are Blackwell approachable, in the sense that the decision maker has a policy ensuring that the expected average vector payoff converges to \cC at some given horizon known in advance. Moreover, we also compare these results to the cases where  the horizon is not known and show that, unlike in standard online learning literature, the necessary or sufficient conditions for the anytime version of this problem are drastically different than those for the fixed horizon. 
ER  -

APA


Flesch, J., Laraki, R. & Perchet, V.. (2016). Online Learning and Blackwell Approachability in Quitting Games. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:941-942 Available from https://proceedings.mlr.press/v49/flesch16.html.

Related Material

Download PDF