Combinatorial Semi-Bandits with Knapsacks

Karthik Abinav Sankararaman; Aleksandrs Slivkins

Combinatorial Semi-Bandits with Knapsacks

Karthik Abinav Sankararaman, Aleksandrs Slivkins

Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1760-1770, 2018.

Abstract

We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks (BwK) and combinatorial semi-bandits. The former concerns limited “resources" consumed by the algorithm, e.g., limited supply in dynamic pricing. The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable. We define a common generalization, support it with several motivating examples, and design an algorithm for it. Our regret bounds are comparable with those for BwK and combinatorial semi-bandits.

Cite this Paper

BibTeX


@InProceedings{pmlr-v84-sankararaman18a,
  title = 	 {Combinatorial Semi-Bandits with Knapsacks},
  author = 	 {Sankararaman, Karthik Abinav and Slivkins, Aleksandrs},
  booktitle = 	 {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1760--1770},
  year = 	 {2018},
  editor = 	 {Storkey, Amos and Perez-Cruz, Fernando},
  volume = 	 {84},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--11 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v84/sankararaman18a/sankararaman18a.pdf},
  url = 	 {https://proceedings.mlr.press/v84/sankararaman18a.html},
  abstract = 	 {We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks (BwK) and combinatorial semi-bandits. The former concerns limited “resources" consumed by the algorithm, e.g., limited supply in dynamic pricing. The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable. We define a common generalization, support it with several motivating examples, and design an algorithm for it. Our regret bounds are comparable with those for BwK and combinatorial semi-bandits. }
}

Endnote

%0 Conference Paper
%T Combinatorial Semi-Bandits with Knapsacks
%A Karthik Abinav Sankararaman
%A Aleksandrs Slivkins
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz	
%F pmlr-v84-sankararaman18a
%I PMLR
%P 1760--1770
%U https://proceedings.mlr.press/v84/sankararaman18a.html
%V 84
%X We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks (BwK) and combinatorial semi-bandits. The former concerns limited “resources" consumed by the algorithm, e.g., limited supply in dynamic pricing. The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable. We define a common generalization, support it with several motivating examples, and design an algorithm for it. Our regret bounds are comparable with those for BwK and combinatorial semi-bandits.

APA


Sankararaman, K.A. & Slivkins, A.. (2018). Combinatorial Semi-Bandits with Knapsacks. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1760-1770 Available from https://proceedings.mlr.press/v84/sankararaman18a.html.

Combinatorial Semi-Bandits with Knapsacks

Abstract

Cite this Paper

Related Material