Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Efe Mert Karagözlü, Yaşar Cahit Yıldırım, Cağın Ararat, Cem Tekin
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3070-3078, 2024.

Abstract

We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of (ϵ,δ)-PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-karagozlu24a, title = {Learning the {P}areto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits}, author = {Karag{\"o}zl{\"u}, Efe Mert and Y{\i}ld{\i}r{\i}m, Ya{\c{s}}ar Cahit and Ararat, Ca\u{g}{\i}n and Tekin, Cem}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {3070--3078}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/karagozlu24a/karagozlu24a.pdf}, url = {https://proceedings.mlr.press/v238/karagozlu24a.html}, abstract = {We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of $(\epsilon, \delta)$-PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.} }
Endnote
%0 Conference Paper %T Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits %A Efe Mert Karagözlü %A Yaşar Cahit Yıldırım %A Cağın Ararat %A Cem Tekin %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-karagozlu24a %I PMLR %P 3070--3078 %U https://proceedings.mlr.press/v238/karagozlu24a.html %V 238 %X We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of $(\epsilon, \delta)$-PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.
APA
Karagözlü, E.M., Yıldırım, Y.C., Ararat, C. & Tekin, C.. (2024). Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3070-3078 Available from https://proceedings.mlr.press/v238/karagozlu24a.html.

Related Material