Pure Exploration and Regret Minimization in Matching Bandits
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9434-9442, 2021.
Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version, in which, at each round, either a single pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms down to a linear dependency in the number of vertices (up to poly-log terms).
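To illustrate why the rank-1 assumption simplifies the matching problem, here is a minimal sketch (not from the paper, and assuming strictly positive weights): when the weight matrix factors as W[i][j] = u[i]*v[j], the rearrangement inequality implies the maximum-weight perfect matching simply pairs the sorted u values with the sorted v values, so no combinatorial search is needed. The sketch checks this against a brute-force enumeration on a small random instance.

```python
import itertools
import random

def max_matching_bruteforce(W):
    """Maximum-weight perfect matching on a complete bipartite graph,
    found by enumerating all permutations (only feasible for small n)."""
    n = len(W)
    return max(
        sum(W[i][p[i]] for i in range(n))
        for p in itertools.permutations(range(n))
    )

def max_matching_rank1(u, v):
    """For rank-1 weights W[i][j] = u[i] * v[j] with positive entries,
    the rearrangement inequality gives the optimum by pairing sorted
    u values with sorted v values."""
    return sum(a * b for a, b in zip(sorted(u), sorted(v)))

random.seed(0)
n = 6
u = [random.random() + 0.1 for _ in range(n)]  # keep entries positive
v = [random.random() + 0.1 for _ in range(n)]
W = [[u[i] * v[j] for j in range(n)] for i in range(n)]

# Both methods agree on the optimal matching value.
assert abs(max_matching_bruteforce(W) - max_matching_rank1(u, v)) < 1e-9
```

This closed-form structure is what lets a learner concentrate exploration on estimating the per-vertex factors u and v rather than all n^2 pairwise weights, which is the intuition behind the linear (rather than quadratic) dependency in the number of vertices.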