Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets

Aaron Klein, Stefan Falkner, Simon Bartels, Philipp Hennig, Frank Hutter
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:528-536, 2017.

Abstract

Bayesian optimization has become a successful tool for hyperparameter optimization of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed FABOLAS, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that FABOLAS often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.
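To make the core idea concrete, the sketch below illustrates the kind of loop the abstract describes: evaluate hyperparameters on subsets of the training data, model both validation loss and training cost as functions of the configuration and the dataset fraction, and pick the next evaluation that promises the largest benefit at the full dataset size per unit of predicted cost. This is not the authors' implementation: FABOLAS uses entropy search (information gain about the optimum) with a purpose-built kernel over the dataset-size dimension, whereas this sketch substitutes expected improvement per predicted second, off-the-shelf scikit-learn Gaussian processes, and an illustrative SVM-on-digits benchmark; all function names and search ranges here are assumptions for the example.

# Minimal sketch of the FABOLAS idea (illustrative, not the paper's algorithm):
# model validation loss and training cost over (hyperparameters, dataset fraction s),
# then choose the candidate with the best predicted improvement at s = 1 per unit
# of predicted cost. EI and scikit-learn GPs stand in for entropy search here.
import time
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_digits
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

def evaluate(log_C, log_gamma, s):
    """Train an RBF-SVM on a fraction s of the training data; return (loss, cost)."""
    n = max(50, int(s * len(X_train)))
    idx = rng.choice(len(X_train), size=n, replace=False)
    t0 = time.time()
    clf = SVC(C=10.0 ** log_C, gamma=10.0 ** log_gamma).fit(X_train[idx], y_train[idx])
    cost = max(time.time() - t0, 1e-6)          # wall-clock training time
    return 1.0 - clf.score(X_val, y_val), cost  # validation error on held-out data

# Observations live in the joint space (log_C, log_gamma, s).
obs_x, obs_loss, obs_cost = [], [], []
for _ in range(5):  # a few random initial points on small subsets
    x = [rng.uniform(-3, 3), rng.uniform(-5, 0), rng.choice([1 / 64, 1 / 16, 1 / 4])]
    loss, cost = evaluate(*x)
    obs_x.append(x); obs_loss.append(loss); obs_cost.append(cost)

for _ in range(20):
    gp_loss = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp_cost = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp_loss.fit(np.array(obs_x), np.array(obs_loss))
    gp_cost.fit(np.array(obs_x), np.log(np.array(obs_cost)))  # cost modeled in log space

    # Random candidate configurations paired with candidate subset sizes.
    cand = np.column_stack([rng.uniform(-3, 3, 500),
                            rng.uniform(-5, 0, 500),
                            rng.choice([1 / 64, 1 / 16, 1 / 4, 1.0], size=500)])
    # Expected improvement is judged at the FULL dataset (s = 1) ...
    cand_full = cand.copy()
    cand_full[:, 2] = 1.0
    mu, sd = gp_loss.predict(cand_full, return_std=True)
    best = min(obs_loss)
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)
    # ... but divided by the cost predicted at the subset we would actually run.
    pred_cost = np.exp(gp_cost.predict(cand))
    x_next = cand[np.argmax(ei / pred_cost)]

    loss, cost = evaluate(*x_next)
    obs_x.append(list(x_next)); obs_loss.append(loss); obs_cost.append(cost)

print("best observed validation error:", min(obs_loss))

The design choice this sketch is meant to expose is the joint model over configuration and dataset fraction: cheap small-subset evaluations still inform the predictions at s = 1, which is what lets the optimizer spend most of its budget on inexpensive runs while reasoning about full-dataset performance.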

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-klein17a,
  title     = {{Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets}},
  author    = {Klein, Aaron and Falkner, Stefan and Bartels, Simon and Hennig, Philipp and Hutter, Frank},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {528--536},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/klein17a/klein17a.pdf},
  url       = {https://proceedings.mlr.press/v54/klein17a.html},
  abstract  = {Bayesian optimization has become a successful tool for hyperparameter optimization of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets, by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed FABOLAS, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that FABOLAS often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.}
}
Endnote
%0 Conference Paper
%T Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
%A Aaron Klein
%A Stefan Falkner
%A Simon Bartels
%A Philipp Hennig
%A Frank Hutter
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-klein17a
%I PMLR
%P 528--536
%U https://proceedings.mlr.press/v54/klein17a.html
%V 54
%X Bayesian optimization has become a successful tool for hyperparameter optimization of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets, by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed FABOLAS, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that FABOLAS often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.
APA
Klein, A., Falkner, S., Bartels, S., Hennig, P. & Hutter, F. (2017). Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:528-536. Available from https://proceedings.mlr.press/v54/klein17a.html.
