Scalable Bayesian Optimization Using Deep Neural Networks

Jasper Snoek; Oren Rippel; Kevin Swersky; Ryan Kiros; Nadathur Satish; Narayanan Sundaram; Mostofa Patwary; Mr Prabhat; Ryan Adams

Scalable Bayesian Optimization Using Deep Neural Networks

Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Mostofa Patwary, Mr Prabhat, Ryan Adams

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2171-2180, 2015.

Abstract

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-snoek15,
  title = 	 {Scalable Bayesian Optimization Using Deep Neural Networks},
  author = 	 {Snoek, Jasper and Rippel, Oren and Swersky, Kevin and Kiros, Ryan and Satish, Nadathur and Sundaram, Narayanan and Patwary, Mostofa and Prabhat, Mr and Adams, Ryan},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {2171--2180},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/snoek15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/snoek15.html},
  abstract = 	 {Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.}
}

Endnote

%0 Conference Paper
%T Scalable Bayesian Optimization Using Deep Neural Networks
%A Jasper Snoek
%A Oren Rippel
%A Kevin Swersky
%A Ryan Kiros
%A Nadathur Satish
%A Narayanan Sundaram
%A Mostofa Patwary
%A Mr Prabhat
%A Ryan Adams
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei	
%F pmlr-v37-snoek15
%I PMLR
%P 2171--2180
%U https://proceedings.mlr.press/v37/snoek15.html
%V 37
%X Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.

RIS


TY  - CPAPER
TI  - Scalable Bayesian Optimization Using Deep Neural Networks
AU  - Jasper Snoek
AU  - Oren Rippel
AU  - Kevin Swersky
AU  - Ryan Kiros
AU  - Nadathur Satish
AU  - Narayanan Sundaram
AU  - Mostofa Patwary
AU  - Mr Prabhat
AU  - Ryan Adams
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei	
ID  - pmlr-v37-snoek15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 2171
EP  - 2180
L1  - http://proceedings.mlr.press/v37/snoek15.pdf
UR  - https://proceedings.mlr.press/v37/snoek15.html
AB  - Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.
ER  -

APA


Snoek, J., Rippel, O., Swersky, K., Kiros, R., Satish, N., Sundaram, N., Patwary, M., Prabhat, M. & Adams, R.. (2015). Scalable Bayesian Optimization Using Deep Neural Networks. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2171-2180 Available from https://proceedings.mlr.press/v37/snoek15.html.

Scalable Bayesian Optimization Using Deep Neural Networks

Abstract

Cite this Paper

Related Material