Be Greedy: How Chromatic Number meets Regret Minimization in Graph Bandits
Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR 115:595-605, 2020.
Abstract
We study the classical linear bandit problem on \emph{graphs}, modelling arm rewards through an underlying graph structure $G(N,E)$ such that the rewards of neighboring nodes are similar. Previous attempts along this line have primarily modelled the arm rewards as a smooth function over the graph Laplacian, which, however, fails to characterize the inherent problem complexity in terms of the graph structure. We bridge this gap by showing a regret guarantee of $\tilde{O}(\chi(\overline{G})\sqrt{T})$ (the $\tilde{O}(\cdot)$ notation hides dependencies on $\log T$) that scales only with the chromatic number of the complement graph, $\chi(\overline{G})$, assuming the rewards to be a smooth function over a general class of graph embeddings, namely \emph{orthonormal representations}. Our proposed algorithms yield a regret guarantee of $\tilde{O}(r\sqrt{T})$ for any embedding of rank $r$. Furthermore, if the rewards correspond to a minimum-rank embedding, the regret improves to $\tilde{O}(\chi(\overline{G})\sqrt{T})$; none of the existing works bring out such an influence of the graph structure on the arm rewards. Finally, noting that computing the above minimum-rank embedding is NP-hard, we also propose an alternative embedding scheme, \emph{Greedy Embeddings}, computable in $O(N + |E|)$ time and based on greedy graph coloring; our algorithms perform optimally with it on a large family of graphs (e.g., unions of cliques, complements of $k$-colorable graphs, and regular graphs) and are also shown to outperform state-of-the-art methods on real datasets. Our findings open up new avenues for exploiting graph structure to improve regret performance.
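To make the $O(N + |E|)$ claim concrete, the sketch below shows the greedy coloring mechanics that Greedy Embeddings build on: scan the nodes once, give each the smallest color unused among its already-colored neighbors, and map every node to the one-hot vector of its color class. This is an illustrative Python sketch, not the paper's exact construction; the function names `greedy_coloring` and `greedy_embedding`, the adjacency-list input format, and the one-hot embedding are assumptions made here, and the abstract does not specify whether the coloring is applied to $G$ or to its complement $\overline{G}$.

```python
def greedy_coloring(adj):
    """Greedy graph coloring: visit nodes in order and give each the
    smallest color not used by an already-colored neighbor.
    Each node does work proportional to its degree, so the full pass
    runs in O(N + |E|) time."""
    colors = {}
    for v in adj:
        taken = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in taken:
            c += 1
        colors[v] = c
    return colors


def greedy_embedding(adj):
    """Illustrative embedding (an assumption, not the paper's exact one):
    map node v to the one-hot vector of its greedy color class, so the
    embedding rank equals the number of colors used."""
    colors = greedy_coloring(adj)
    r = max(colors.values()) + 1  # number of colors used
    return {v: [1.0 if c == colors[v] else 0.0 for c in range(r)]
            for v in colors}, r


# Example: two disjoint triangles (a union of cliques).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
embedding, rank = greedy_embedding(adj)
print(rank)       # 3: each triangle needs three colors
print(embedding)  # one-hot vector per node
```

Since each node's scan touches only its incident edges, the whole pass does $O(N + |E|)$ work, matching the running time quoted in the abstract, and the number of colors used by the greedy pass equals the rank of the resulting one-hot embedding.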