Faster & More Reliable Tuning of Neural Networks: Bayesian Optimization with Importance Sampling

Setareh Ariafar; Zelda Mariet; Dana Brooks; Jennifer Dy; Jasper Snoek

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3961-3969, 2021.

Abstract

Many contemporary machine learning models require extensive tuning of hyperparameters to perform well. A variety of methods, such as Bayesian optimization, have been developed to automate and expedite this process. However, tuning remains extremely costly as it typically requires repeatedly fully training models. To address this issue, Bayesian optimization methods have been extended to use cheap, partially trained models to extrapolate to expensive complete models. While this approach enlarges the set of explored hyperparameters, including many low-fidelity observations adds to the intrinsic randomness of the procedure and makes extrapolation challenging. We propose to accelerate hyperparameter tuning for neural networks in a robust way by taking into account the relative amount of information contributed by each training example. To do so, we integrate importance sampling with Bayesian optimization, which significantly increases the quality of the black-box function evaluations and their runtime. To overcome the additional overhead cost of using importance sampling, we cast hyperparameter search as a multi-task Bayesian optimization problem over both hyperparameters and importance sampling design, which achieves the best of both worlds. Through learning a trade-off between training complexity and quality, our method improves upon validation error, in the average and worst-case. We show that this results in more reliable performance of our method in less wall-clock time across a variety of datasets and complex neural architectures.

Cite this Paper

BibTeX

@InProceedings{pmlr-v130-ariafar21a,
  title = 	 {Faster \& More Reliable Tuning of Neural Networks: Bayesian Optimization with Importance Sampling},
  author =       {Ariafar, Setareh and Mariet, Zelda and Brooks, Dana and Dy, Jennifer and Snoek, Jasper},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {3961--3969},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/ariafar21a/ariafar21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/ariafar21a.html},
  abstract = 	 {Many contemporary machine learning models require extensive tuning of hyperparameters to perform well. A variety of methods, such as Bayesian optimization, have been developed to automate and expedite this process. However, tuning remains extremely costly as it typically requires repeatedly fully training models. To address this issue, Bayesian optimization methods have been extended to use cheap, partially trained models to extrapolate to expensive complete models. While this approach enlarges the set of explored hyperparameters, including many low-fidelity observations adds to the intrinsic randomness of the procedure and makes extrapolation challenging. We propose to accelerate hyperparameter tuning for neural networks in a robust way by taking into account the relative amount of information contributed by each training example. To do so, we integrate importance sampling with Bayesian optimization, which significantly increases the quality of the black-box function evaluations and their runtime. To overcome the additional overhead cost of using importance sampling, we cast hyperparameter search as a multi-task Bayesian optimization problem over both hyperparameters and importance sampling design, which achieves the best of both worlds. Through learning a trade-off between training complexity and quality, our method improves upon validation error, in the average and worst-case. We show that this results in more reliable performance of our method in less wall-clock time across a variety of datasets and complex neural architectures.}
}

Endnote

%0 Conference Paper
%T Faster & More Reliable Tuning of Neural Networks: Bayesian Optimization with Importance Sampling
%A Setareh Ariafar
%A Zelda Mariet
%A Dana Brooks
%A Jennifer Dy
%A Jasper Snoek
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-ariafar21a
%I PMLR
%P 3961--3969
%U https://proceedings.mlr.press/v130/ariafar21a.html
%V 130
%X Many contemporary machine learning models require extensive tuning of hyperparameters to perform well. A variety of methods, such as Bayesian optimization, have been developed to automate and expedite this process. However, tuning remains extremely costly as it typically requires repeatedly fully training models. To address this issue, Bayesian optimization methods have been extended to use cheap, partially trained models to extrapolate to expensive complete models. While this approach enlarges the set of explored hyperparameters, including many low-fidelity observations adds to the intrinsic randomness of the procedure and makes extrapolation challenging. We propose to accelerate hyperparameter tuning for neural networks in a robust way by taking into account the relative amount of information contributed by each training example. To do so, we integrate importance sampling with Bayesian optimization, which significantly increases the quality of the black-box function evaluations and their runtime. To overcome the additional overhead cost of using importance sampling, we cast hyperparameter search as a multi-task Bayesian optimization problem over both hyperparameters and importance sampling design, which achieves the best of both worlds. Through learning a trade-off between training complexity and quality, our method improves upon validation error, in the average and worst-case. We show that this results in more reliable performance of our method in less wall-clock time across a variety of datasets and complex neural architectures.

APA

Ariafar, S., Mariet, Z., Brooks, D., Dy, J. & Snoek, J. (2021). Faster & More Reliable Tuning of Neural Networks: Bayesian Optimization with Importance Sampling. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3961-3969. Available from https://proceedings.mlr.press/v130/ariafar21a.html.
