YAHPO Gym - An Efficient Multi-Objective Multi-Fidelity Benchmark for Hyperparameter Optimization

Florian Pfisterer, Lennart Schneider, Julia Moosbauer, Martin Binder, Bernd Bischl
Proceedings of the First International Conference on Automated Machine Learning, PMLR 188:3/1-39, 2022.

Abstract

When developing and analyzing new hyperparameter optimization methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we propose a new set of challenging and relevant benchmark problems motivated by desirable properties and requirements for such benchmarks. Our new surrogate-based benchmark collection consists of 14 scenarios that in total constitute over 700 multi-fidelity hyperparameter optimization problems, which all enable multi-objective hyperparameter optimization. Furthermore, we empirically compare surrogate-based benchmarks to the more widely-used tabular benchmarks, and demonstrate that the latter may produce unfaithful results regarding the performance ranking of HPO methods. We examine and compare our benchmark collection with respect to defined requirements and propose a single-objective as well as a multi-objective benchmark suite on which we compare 7 single-objective and 7 multi-objective optimizers in a benchmark experiment. Our software is available at https://github.com/slds-lmu/yahpo_gym.
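The benchmark collection is exposed through the yahpo_gym Python package. The following minimal sketch, based on the usage shown in the project README (names such as BenchmarkSet, set_instance, get_opt_space, and objective_function are taken from that documentation and may differ between package versions), selects one scenario, samples a configuration from its search space, and queries the surrogate; it assumes the benchmark data files have been downloaded and configured locally as described in the README.

# Minimal sketch of querying a YAHPO Gym scenario; API names follow the
# project README (https://github.com/slds-lmu/yahpo_gym) and may differ
# between versions. Assumes the surrogate data files are set up locally.
from yahpo_gym import benchmark_set
import yahpo_gym.benchmarks.lcbench  # registers the "lcbench" scenario

# Select the "lcbench" scenario and one of its instances (an OpenML task id).
bench = benchmark_set.BenchmarkSet("lcbench")
bench.set_instance("3945")

# Sample a configuration from the scenario's search space and evaluate it.
# The surrogate returns all objectives at once, which is what enables
# multi-objective (and, via the fidelity parameter, multi-fidelity) HPO.
config = bench.get_opt_space().sample_configuration(1).get_dictionary()
result = bench.objective_function(config)
print(result)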

Cite this Paper


BibTeX
@InProceedings{pmlr-v188-pfisterer22a,
  title     = {YAHPO Gym - An Efficient Multi-Objective Multi-Fidelity Benchmark for Hyperparameter Optimization},
  author    = {Pfisterer, Florian and Schneider, Lennart and Moosbauer, Julia and Binder, Martin and Bischl, Bernd},
  booktitle = {Proceedings of the First International Conference on Automated Machine Learning},
  pages     = {3/1--39},
  year      = {2022},
  editor    = {Guyon, Isabelle and Lindauer, Marius and van der Schaar, Mihaela and Hutter, Frank and Garnett, Roman},
  volume    = {188},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v188/pfisterer22a/pfisterer22a.pdf},
  url       = {https://proceedings.mlr.press/v188/pfisterer22a.html},
  abstract  = {When developing and analyzing new hyperparameter optimization methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we propose a new set of challenging and relevant benchmark problems motivated by desirable properties and requirements for such benchmarks. Our new surrogate-based benchmark collection consists of 14 scenarios that in total constitute over 700 multi-fidelity hyperparameter optimization problems, which all enable multi-objective hyperparameter optimization. Furthermore, we empirically compare surrogate-based benchmarks to the more widely-used tabular benchmarks, and demonstrate that the latter may produce unfaithful results regarding the performance ranking of HPO methods. We examine and compare our benchmark collection with respect to defined requirements and propose a single-objective as well as a multi-objective benchmark suite on which we compare 7 single-objective and 7 multi-objective optimizers in a benchmark experiment. Our software is available at \url{https://github.com/slds-lmu/yahpo_gym}.}
}
Endnote
%0 Conference Paper
%T YAHPO Gym - An Efficient Multi-Objective Multi-Fidelity Benchmark for Hyperparameter Optimization
%A Florian Pfisterer
%A Lennart Schneider
%A Julia Moosbauer
%A Martin Binder
%A Bernd Bischl
%B Proceedings of the First International Conference on Automated Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Isabelle Guyon
%E Marius Lindauer
%E Mihaela van der Schaar
%E Frank Hutter
%E Roman Garnett
%F pmlr-v188-pfisterer22a
%I PMLR
%P 3/1--39
%U https://proceedings.mlr.press/v188/pfisterer22a.html
%V 188
%X When developing and analyzing new hyperparameter optimization methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we propose a new set of challenging and relevant benchmark problems motivated by desirable properties and requirements for such benchmarks. Our new surrogate-based benchmark collection consists of 14 scenarios that in total constitute over 700 multi-fidelity hyperparameter optimization problems, which all enable multi-objective hyperparameter optimization. Furthermore, we empirically compare surrogate-based benchmarks to the more widely-used tabular benchmarks, and demonstrate that the latter may produce unfaithful results regarding the performance ranking of HPO methods. We examine and compare our benchmark collection with respect to defined requirements and propose a single-objective as well as a multi-objective benchmark suite on which we compare 7 single-objective and 7 multi-objective optimizers in a benchmark experiment. Our software is available at https://github.com/slds-lmu/yahpo_gym.
APA
Pfisterer, F., Schneider, L., Moosbauer, J., Binder, M. & Bischl, B. (2022). YAHPO Gym - An Efficient Multi-Objective Multi-Fidelity Benchmark for Hyperparameter Optimization. Proceedings of the First International Conference on Automated Machine Learning, in Proceedings of Machine Learning Research 188:3/1-39. Available from https://proceedings.mlr.press/v188/pfisterer22a.html.
