Efficient Neural Architecture Search via Parameters Sharing

Hieu Pham; Melody Guan; Barret Zoph; Quoc Le; Jeff Dean

Efficient Neural Architecture Search via Parameters Sharing

Hieu Pham, Melody Guan, Barret Zoph, Quoc Le, Jeff Dean

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4095-4104, 2018.

Abstract

We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph, where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile a model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performances, whilst using much fewer GPU-hours than existing automatic model design approaches, and notably, 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, which is on par with the 2.65% test error of NASNet (Zoph et al., 2018).

Cite this Paper

BibTeX


@InProceedings{pmlr-v80-pham18a,
  title = 	 {Efficient Neural Architecture Search via Parameters Sharing},
  author =       {Pham, Hieu and Guan, Melody and Zoph, Barret and Le, Quoc and Dean, Jeff},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {4095--4104},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/pham18a/pham18a.pdf},
  url = 	 {https://proceedings.mlr.press/v80/pham18a.html},
  abstract = 	 {We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph, where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile a model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performances, whilst using much fewer GPU-hours than existing automatic model design approaches, and notably, 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, which is on par with the 2.65% test error of NASNet (Zoph et al., 2018).}
}

Endnote

%0 Conference Paper
%T Efficient Neural Architecture Search via Parameters Sharing
%A Hieu Pham
%A Melody Guan
%A Barret Zoph
%A Quoc Le
%A Jeff Dean
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause	
%F pmlr-v80-pham18a
%I PMLR
%P 4095--4104
%U https://proceedings.mlr.press/v80/pham18a.html
%V 80
%X We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph, where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile a model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performances, whilst using much fewer GPU-hours than existing automatic model design approaches, and notably, 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, which is on par with the 2.65% test error of NASNet (Zoph et al., 2018).

APA


Pham, H., Guan, M., Zoph, B., Le, Q. & Dean, J.. (2018). Efficient Neural Architecture Search via Parameters Sharing. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4095-4104 Available from https://proceedings.mlr.press/v80/pham18a.html.

Efficient Neural Architecture Search via Parameters Sharing

Abstract

Cite this Paper

Related Material