Matroid Semi-Bandits in Sublinear Time

Ruo-Chun Tzeng; Naoto Ohsaka; Kaito Ariu

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:48855-48877, 2024.

Abstract

We study the matroid semi-bandits problem, where at each round the learner plays a subset of $K$ arms from a feasible set, and the goal is to maximize the expected cumulative linear rewards. Existing algorithms have per-round time complexity at least $\Omega(K)$, which becomes expensive when $K$ is large. To address this computational issue, we propose FasterCUCB whose sampling rule takes time sublinear in $K$ for common classes of matroids: $\mathcal{O}(D\text{ polylog}(K)\text{ polylog}(T))$ for uniform matroids, partition matroids, and graphical matroids, and $\mathcal{O}(D\sqrt{K}\text{ polylog}(T))$ for transversal matroids. Here, $D$ is the maximum number of elements in any feasible subset of arms, and $T$ is the horizon. Our technique is based on dynamic maintenance of an approximate maximum-weight basis over inner-product weights. Although the introduction of an approximate maximum-weight basis presents a challenge in regret analysis, we can still guarantee an upper bound on regret as tight as CUCB in the sense that it matches the gap-dependent lower bound by Kveton et al. (2014a) asymptotically.
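To make the $\Omega(K)$ per-round cost concrete, the following is a minimal sketch of a CUCB-style baseline on a uniform matroid of rank $D$. It is not FasterCUCB and is not taken from the paper. The Bernoulli rewards, the confidence radius $\sqrt{1.5 \ln t / n}$, and all function names are assumptions made for illustration. Each round recomputes all $K$ indices and rescans them to pick the $D$ largest, which is the linear-time step the abstract describes as expensive.

```python
import heapq
import math
import random


def ucb_index(mean, pulls, t):
    """Optimistic index; unpulled arms get +inf so they are tried first.

    The radius sqrt(1.5 * ln t / pulls) is an assumed, commonly used choice.
    """
    if pulls == 0:
        return math.inf
    return mean + math.sqrt(1.5 * math.log(t) / pulls)


def max_weight_basis_uniform(weights, d):
    """Maximum-weight basis of a rank-d uniform matroid: the d heaviest arms.

    Every weight is inspected, so one call costs at least Omega(K).
    """
    return heapq.nlargest(d, range(len(weights)), key=weights.__getitem__)


def run_cucb_uniform(true_means, d, horizon, seed=0):
    """Play `horizon` rounds with Bernoulli arms (an assumption of this sketch)."""
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    means = [0.0] * k
    total_reward = 0.0
    for t in range(1, horizon + 1):
        # Linear-time step: all K indices are recomputed every round.
        weights = [ucb_index(means[i], pulls[i], t) for i in range(k)]
        basis = max_weight_basis_uniform(weights, d)
        # Semi-bandit feedback: the reward of each played arm is observed.
        for i in basis:
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            pulls[i] += 1
            means[i] += (reward - means[i]) / pulls[i]
            total_reward += reward
    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = run_cucb_uniform([0.9, 0.8, 0.1, 0.1], d=2, horizon=2000)
    print(pulls, total)
```

In this baseline only the $D$ played arms change their empirical means in a round, while the $\ln t$ term moves every index. The abstract's approach, dynamic maintenance of an approximate maximum-weight basis, targets exactly this full recomputation and rescan.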

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-tzeng24a,
  title = 	 {Matroid Semi-Bandits in Sublinear Time},
  author =       {Tzeng, Ruo-Chun and Ohsaka, Naoto and Ariu, Kaito},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {48855--48877},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/tzeng24a/tzeng24a.pdf},
  url = 	 {https://proceedings.mlr.press/v235/tzeng24a.html},
  abstract = 	 {We study the matroid semi-bandits problem, where at each round the learner plays a subset of $K$ arms from a feasible set, and the goal is to maximize the expected cumulative linear rewards. Existing algorithms have per-round time complexity at least $\Omega(K)$, which becomes expensive when $K$ is large. To address this computational issue, we propose FasterCUCB whose sampling rule takes time sublinear in $K$ for common classes of matroids: $\mathcal{O}(D\text{ polylog}(K)\text{ polylog}(T))$ for uniform matroids, partition matroids, and graphical matroids, and $\mathcal{O}(D\sqrt{K}\text{ polylog}(T))$ for transversal matroids. Here, $D$ is the maximum number of elements in any feasible subset of arms, and $T$ is the horizon. Our technique is based on dynamic maintenance of an approximate maximum-weight basis over inner-product weights. Although the introduction of an approximate maximum-weight basis presents a challenge in regret analysis, we can still guarantee an upper bound on regret as tight as CUCB in the sense that it matches the gap-dependent lower bound by Kveton et al. (2014a) asymptotically.}
}

Endnote

%0 Conference Paper
%T Matroid Semi-Bandits in Sublinear Time
%A Ruo-Chun Tzeng
%A Naoto Ohsaka
%A Kaito Ariu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-tzeng24a
%I PMLR
%P 48855--48877
%U https://proceedings.mlr.press/v235/tzeng24a.html
%V 235
%X We study the matroid semi-bandits problem, where at each round the learner plays a subset of $K$ arms from a feasible set, and the goal is to maximize the expected cumulative linear rewards. Existing algorithms have per-round time complexity at least $\Omega(K)$, which becomes expensive when $K$ is large. To address this computational issue, we propose FasterCUCB whose sampling rule takes time sublinear in $K$ for common classes of matroids: $\mathcal{O}(D\text{ polylog}(K)\text{ polylog}(T))$ for uniform matroids, partition matroids, and graphical matroids, and $\mathcal{O}(D\sqrt{K}\text{ polylog}(T))$ for transversal matroids. Here, $D$ is the maximum number of elements in any feasible subset of arms, and $T$ is the horizon. Our technique is based on dynamic maintenance of an approximate maximum-weight basis over inner-product weights. Although the introduction of an approximate maximum-weight basis presents a challenge in regret analysis, we can still guarantee an upper bound on regret as tight as CUCB in the sense that it matches the gap-dependent lower bound by Kveton et al. (2014a) asymptotically.

APA


Tzeng, R., Ohsaka, N. & Ariu, K. (2024). Matroid Semi-Bandits in Sublinear Time. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:48855-48877. Available from https://proceedings.mlr.press/v235/tzeng24a.html.
