Sparse Dueling Bandits

Kevin Jamieson; Sumeet Katariya; Atul Deshpande; Robert Nowak

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:416-424, 2015.

Abstract

The dueling bandit problem is a variation of the classical multi-armed bandit in which the allowable actions are noisy comparisons between pairs of arms. This paper focuses on a new approach for finding the best arm according to the Borda criterion using noisy comparisons. We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse gaps squared of the Borda scores of each arm. We explore this dependence further and consider structural constraints on the pairwise comparison matrix (a particular form of sparsity natural to this problem) that can significantly reduce the sample complexity. This motivates a new algorithm called Successive Elimination with Comparison Sparsity (SECS) that exploits sparsity to find the Borda winner using fewer samples than standard algorithms. We also evaluate the new algorithm experimentally with synthetic and real data. The results show that the sparsity model and the new algorithm can provide significant improvements over standard approaches.
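As a rough illustration of the quantities the abstract refers to, the sketch below computes Borda scores from a pairwise preference matrix and the sum of inverse squared Borda gaps. It assumes the common definition in which the Borda score of arm i is the average probability that arm i beats another arm, and it assumes a unique Borda winner. The matrix `P`, the function names, and the numbers are hypothetical and are not taken from the paper; this is not the SECS algorithm.

```python
def borda_scores(P):
    """Borda score of arm i: mean of P[i][j] over all opponents j != i.

    P[i][j] is assumed to be the probability that arm i beats arm j.
    """
    n = len(P)
    return [
        sum(P[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def inverse_gap_sum(scores):
    """Sum over non-winning arms of 1 / (gap to the Borda winner)^2.

    Assumes the top score is unique; a tie would give a zero gap.
    """
    best = max(scores)
    winner = scores.index(best)
    return sum(
        1.0 / (best - s) ** 2
        for i, s in enumerate(scores)
        if i != winner
    )


# Hypothetical 3-arm preference matrix (rows beat columns with these probabilities).
P = [
    [0.5, 0.6, 0.8],
    [0.4, 0.5, 0.7],
    [0.2, 0.3, 0.5],
]

scores = borda_scores(P)
winner = scores.index(max(scores))
H = inverse_gap_sum(scores)
print("Borda scores:", scores)
print("Borda winner:", winner)
print("Sum of inverse squared gaps:", H)
```

Arms whose Borda score is close to the winner's dominate this sum, which is the dependence the abstract says structural (sparsity) assumptions can reduce.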

Cite this Paper

BibTeX


@InProceedings{pmlr-v38-jamieson15,
  title = 	 {{Sparse Dueling Bandits}},
  author = 	 {Jamieson, Kevin and Katariya, Sumeet and Deshpande, Atul and Nowak, Robert},
  booktitle = 	 {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {416--424},
  year = 	 {2015},
  editor = 	 {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume = 	 {38},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {San Diego, California, USA},
  month = 	 {09--12 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v38/jamieson15.pdf},
  url = 	 {https://proceedings.mlr.press/v38/jamieson15.html},
  abstract = 	 {The dueling bandit problem is a variation of the classical multi-armed bandit in which the allowable actions are noisy comparisons between pairs of arms. This paper focuses on a new approach for finding the best arm according to the Borda criterion using noisy comparisons. We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse gaps squared of the Borda scores of each arm. We explore this dependence further and consider structural constraints on the pairwise comparison matrix (a particular form of sparsity natural to this problem) that can significantly reduce the sample complexity. This motivates a new algorithm called Successive Elimination with Comparison Sparsity (SECS) that exploits sparsity to find the Borda winner using fewer samples than standard algorithms. We also evaluate the new algorithm experimentally with synthetic and real data. The results show that the sparsity model and the new algorithm can provide significant improvements over standard approaches.}
}

Endnote

%0 Conference Paper
%T Sparse Dueling Bandits
%A Kevin Jamieson
%A Sumeet Katariya
%A Atul Deshpande
%A Robert Nowak
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan
%F pmlr-v38-jamieson15
%I PMLR
%P 416--424
%U https://proceedings.mlr.press/v38/jamieson15.html
%V 38
%X The dueling bandit problem is a variation of the classical multi-armed bandit in which the allowable actions are noisy comparisons between pairs of arms. This paper focuses on a new approach for finding the best arm according to the Borda criterion using noisy comparisons. We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse gaps squared of the Borda scores of each arm. We explore this dependence further and consider structural constraints on the pairwise comparison matrix (a particular form of sparsity natural to this problem) that can significantly reduce the sample complexity. This motivates a new algorithm called Successive Elimination with Comparison Sparsity (SECS) that exploits sparsity to find the Borda winner using fewer samples than standard algorithms. We also evaluate the new algorithm experimentally with synthetic and real data. The results show that the sparsity model and the new algorithm can provide significant improvements over standard approaches.

RIS


TY  - CPAPER
TI  - Sparse Dueling Bandits
AU  - Kevin Jamieson
AU  - Sumeet Katariya
AU  - Atul Deshpande
AU  - Robert Nowak
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan
ID  - pmlr-v38-jamieson15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 416
EP  - 424
L1  - http://proceedings.mlr.press/v38/jamieson15.pdf
UR  - https://proceedings.mlr.press/v38/jamieson15.html
AB  - The dueling bandit problem is a variation of the classical multi-armed bandit in which the allowable actions are noisy comparisons between pairs of arms. This paper focuses on a new approach for finding the best arm according to the Borda criterion using noisy comparisons. We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse gaps squared of the Borda scores of each arm. We explore this dependence further and consider structural constraints on the pairwise comparison matrix (a particular form of sparsity natural to this problem) that can significantly reduce the sample complexity. This motivates a new algorithm called Successive Elimination with Comparison Sparsity (SECS) that exploits sparsity to find the Borda winner using fewer samples than standard algorithms. We also evaluate the new algorithm experimentally with synthetic and real data. The results show that the sparsity model and the new algorithm can provide significant improvements over standard approaches.
ER  -

APA


Jamieson, K., Katariya, S., Deshpande, A. & Nowak, R. (2015). Sparse Dueling Bandits. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:416-424. Available from https://proceedings.mlr.press/v38/jamieson15.html.
