Online Learning with Noisy Side Observations

Tomáš Kocák; Gergely Neu; Michal Valko

Online Learning with Noisy Side Observations

Tomáš Kocák, Gergely Neu, Michal Valko

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1186-1194, 2016.

Abstract

We propose a new partial-observability model for online learning problems where the learner, besides its own loss, also observes some noisy feedback about the other actions, depending on the underlying structure of the problem. We represent this structure by a weighted directed graph, where the edge weights are related to the quality of the feedback shared by the connected nodes. Our main contribution is an efficient algorithm that guarantees a regret of O(\sqrt(α^* T) after T rounds, where α^* is a novel graph property that we call the effective independence number. Our algorithm is completely parameter-free and does not require knowledge (or even estimation) of alpha^*. For the special case of binary edge weights, our setting reduces to the partial-observability models of Mannor & Shamir (2011) and Alon et al. (2013) and our algorithm recovers the near-optimal regret bounds.

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-kocak16,
  title = 	 {Online Learning with Noisy Side Observations},
  author = 	 {Kocák, Tomáš and Neu, Gergely and Valko, Michal},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1186--1194},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/kocak16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/kocak16.html},
  abstract = 	 {We propose a new partial-observability model for online learning problems where the learner, besides its own loss, also observes some noisy feedback about the other actions, depending on the underlying structure of the problem. We represent this structure by a weighted directed graph, where the edge weights are related to the quality of the feedback shared by the connected nodes. Our main contribution is an efficient algorithm that guarantees a regret of O(\sqrt(α^* T) after T rounds, where α^* is a novel  graph  property that we call the effective independence number. Our algorithm is completely parameter-free and does not require knowledge (or even estimation) of alpha^*. For the special case of binary edge weights, our setting reduces to the partial-observability models of Mannor & Shamir (2011) and Alon et al. (2013) and our algorithm recovers the near-optimal regret bounds.}
}

Endnote

%0 Conference Paper
%T Online Learning with Noisy Side Observations
%A Tomáš Kocák
%A Gergely Neu
%A Michal Valko
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert	
%F pmlr-v51-kocak16
%I PMLR
%P 1186--1194
%U https://proceedings.mlr.press/v51/kocak16.html
%V 51
%X We propose a new partial-observability model for online learning problems where the learner, besides its own loss, also observes some noisy feedback about the other actions, depending on the underlying structure of the problem. We represent this structure by a weighted directed graph, where the edge weights are related to the quality of the feedback shared by the connected nodes. Our main contribution is an efficient algorithm that guarantees a regret of O(\sqrt(α^* T) after T rounds, where α^* is a novel  graph  property that we call the effective independence number. Our algorithm is completely parameter-free and does not require knowledge (or even estimation) of alpha^*. For the special case of binary edge weights, our setting reduces to the partial-observability models of Mannor & Shamir (2011) and Alon et al. (2013) and our algorithm recovers the near-optimal regret bounds.

RIS


TY  - CPAPER
TI  - Online Learning with Noisy Side Observations
AU  - Tomáš Kocák
AU  - Gergely Neu
AU  - Michal Valko
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert	
ID  - pmlr-v51-kocak16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1186
EP  - 1194
L1  - http://proceedings.mlr.press/v51/kocak16.pdf
UR  - https://proceedings.mlr.press/v51/kocak16.html
AB  - We propose a new partial-observability model for online learning problems where the learner, besides its own loss, also observes some noisy feedback about the other actions, depending on the underlying structure of the problem. We represent this structure by a weighted directed graph, where the edge weights are related to the quality of the feedback shared by the connected nodes. Our main contribution is an efficient algorithm that guarantees a regret of O(\sqrt(α^* T) after T rounds, where α^* is a novel  graph  property that we call the effective independence number. Our algorithm is completely parameter-free and does not require knowledge (or even estimation) of alpha^*. For the special case of binary edge weights, our setting reduces to the partial-observability models of Mannor & Shamir (2011) and Alon et al. (2013) and our algorithm recovers the near-optimal regret bounds.
ER  -

APA


Kocák, T., Neu, G. & Valko, M.. (2016). Online Learning with Noisy Side Observations. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1186-1194 Available from https://proceedings.mlr.press/v51/kocak16.html.

Online Learning with Noisy Side Observations

Abstract

Cite this Paper

Related Material