Weight-Entanglement Meets Gradient-Based Neural Architecture Search

Rhea Sanjay Sukthanker, Arjun Krishnakumar, Mahmoud Safari, Frank Hutter
Proceedings of the Third International Conference on Automated Machine Learning, PMLR 256:12/1-25, 2024.

Abstract

Weight sharing is a fundamental concept in neural architecture search (NAS), enabling gradient-based methods to explore cell-based architectural spaces significantly faster than traditional blackbox approaches. In parallel, weight entanglement has emerged as a technique for more intricate parameter sharing in macro-architectural spaces. Since weight entanglement is not directly compatible with gradient-based NAS methods, the two paradigms have largely developed independently in parallel sub-communities. This paper aims to bridge the gap between these sub-communities by proposing a novel scheme that adapts gradient-based methods to weight-entangled spaces. This enables an in-depth comparative assessment and analysis of the performance of gradient-based NAS in weight-entangled search spaces. Our findings reveal that integrating weight entanglement with gradient-based NAS brings the benefits of gradient-based methods while preserving the memory efficiency of weight-entangled spaces. The code for our work is openly accessible at https://anon-github.automl.cc/r/TangleNAS-5BA5.
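
To make the contrast concrete, below is a minimal PyTorch sketch of the core idea: candidate operations (here, candidate layer widths) share slices of a single entangled weight matrix, and DARTS-style architecture parameters mix their outputs differentiably. The class name, candidate widths, and all implementation details are illustrative assumptions for exposition, not the authors' TangleNAS implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class EntangledLinear(nn.Module):
    """Toy weight-entangled linear layer (illustrative sketch only).

    Every candidate output width is a slice of one shared, max-width
    weight matrix (weight entanglement), and a softmax over learnable
    architecture parameters mixes the candidates (gradient-based NAS).
    """

    def __init__(self, in_features: int, candidate_widths=(32, 64, 128)):
        super().__init__()
        self.widths = sorted(candidate_widths)
        self.max_width = self.widths[-1]
        # One entangled weight matrix; smaller candidates reuse its slices,
        # so memory cost is that of the single largest candidate.
        self.weight = nn.Parameter(0.01 * torch.randn(self.max_width, in_features))
        self.bias = nn.Parameter(torch.zeros(self.max_width))
        # DARTS-style architecture parameters, one per candidate width.
        self.alpha = nn.Parameter(torch.zeros(len(self.widths)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        probs = F.softmax(self.alpha, dim=-1)
        out = x.new_zeros(x.shape[0], self.max_width)
        for p, w in zip(probs, self.widths):
            # Slice the shared weights for this candidate, zero-pad its
            # output to the maximum width, and accumulate with weight p.
            y = F.linear(x, self.weight[:w], self.bias[:w])
            out = out + p * F.pad(y, (0, self.max_width - w))
        return out

x = torch.randn(4, 16)
layer = EntangledLinear(16)
print(layer(x).shape)  # torch.Size([4, 128])

Entanglement keeps the supernet's memory footprint at that of the largest candidate, while the softmax-weighted mixture restores the differentiability with respect to architecture parameters that gradient-based NAS requires.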

Cite this Paper


BibTeX
@InProceedings{pmlr-v256-sukthanker24a,
  title     = {Weight-Entanglement Meets Gradient-Based Neural Architecture Search},
  author    = {Sukthanker, Rhea Sanjay and Krishnakumar, Arjun and Safari, Mahmoud and Hutter, Frank},
  booktitle = {Proceedings of the Third International Conference on Automated Machine Learning},
  pages     = {12/1--25},
  year      = {2024},
  editor    = {Eggensperger, Katharina and Garnett, Roman and Vanschoren, Joaquin and Lindauer, Marius and Gardner, Jacob R.},
  volume    = {256},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v256/main/assets/sukthanker24a/sukthanker24a.pdf},
  url       = {https://proceedings.mlr.press/v256/sukthanker24a.html}
}
APA
Sukthanker, R. S., Krishnakumar, A., Safari, M., & Hutter, F. (2024). Weight-Entanglement Meets Gradient-Based Neural Architecture Search. Proceedings of the Third International Conference on Automated Machine Learning, in Proceedings of Machine Learning Research 256:12/1-25. Available from https://proceedings.mlr.press/v256/sukthanker24a.html.
