Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Efe Mert Karagözlü; Yaşar Cahit Yıldırım; Cağın Ararat; Cem Tekin

Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Efe Mert Karagözlü, Yaşar Cahit Yıldırım, Cağın Ararat, Cem Tekin

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3070-3078, 2024.

Abstract

We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of

$(\epsilon, \delta)$ -PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.

Cite this Paper

BibTeX

@InProceedings{pmlr-v238-karagozlu24a,
  title = 	 {Learning the {P}areto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits},
  author =       {Karag{\"o}zl{\"u}, Efe Mert and Y{\i}ld{\i}r{\i}m, Ya{\c{s}}ar Cahit and Ararat, Ca\u{g}{\i}n and Tekin, Cem},
  booktitle = 	 {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {3070--3078},
  year = 	 {2024},
  editor = 	 {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = 	 {238},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--04 May},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v238/karagozlu24a/karagozlu24a.pdf},
  url = 	 {https://proceedings.mlr.press/v238/karagozlu24a.html},
  abstract = 	 {We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of $(\epsilon, \delta)$-PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.}
}

Endnote

%0 Conference Paper
%T Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits
%A Efe Mert Karagözlü
%A Yaşar Cahit Yıldırım
%A Cağın Ararat
%A Cem Tekin
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li	
%F pmlr-v238-karagozlu24a
%I PMLR
%P 3070--3078
%U https://proceedings.mlr.press/v238/karagozlu24a.html
%V 238
%X We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of $(\epsilon, \delta)$-PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.

APA

Karagözlü, E.M., Yıldırım, Y.C., Ararat, C. & Tekin, C.. (2024). Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3070-3078 Available from https://proceedings.mlr.press/v238/karagozlu24a.html.

Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Abstract

Cite this Paper

Related Material