A Neural Architecture Predictor based on GNN-Enhanced Transformer

Xunzhi Xiang, Kun Jing, Jungang Xu
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1729-1737, 2024.

Abstract

Neural architecture performance predictors are an efficient approach to architecture estimation in Neural Architecture Search (NAS). However, existing predictors based on Graph Neural Networks (GNNs) are deficient in modeling long-range interactions between operation nodes and are prone to over-smoothing, which limits their ability to learn neural architecture representations. Furthermore, some Transformer-based predictors use simple position encodings to improve performance via the self-attention mechanism, but they fail to fully exploit the subgraph structure information of the graph. To solve these problems, we propose a novel method that enhances the graph representation of neural architectures by combining GNNs and Transformer blocks. We evaluate the effectiveness of our predictor on the NAS-Bench-101 and NAS-Bench-201 benchmarks; the architecture discovered in the DARTS search space achieves an accuracy of 97.61% on the CIFAR-10 dataset, and our method outperforms traditional position encoding methods such as adjacency and Laplacian matrices. The code of our work is available at \url{https://github.com/GNET}.
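
As a concrete illustration of the idea sketched in the abstract, the following PyTorch snippet shows one plausible way to wire a GNN-enhanced Transformer predictor: GCN-style message passing injects local structural information into the operation-node embeddings of a cell, a standard Transformer encoder then models long-range interactions between nodes, and a pooled regression head outputs the predicted accuracy. This is a minimal sketch under our own assumptions, not the authors' released implementation (see the repository linked above); all class names, dimensions, and hyperparameters are hypothetical.

import torch
import torch.nn as nn

class GNNEnhancedTransformerPredictor(nn.Module):
    """Toy predictor: GCN-style message passing + Transformer encoder + regressor."""
    def __init__(self, num_ops, hidden=128, gnn_layers=2, tf_layers=3, heads=4):
        super().__init__()
        self.embed = nn.Linear(num_ops, hidden)  # one-hot operation -> node embedding
        self.gnn = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(gnn_layers)])
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads,
                                           dim_feedforward=2 * hidden, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=tf_layers)
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, adj, ops):
        # adj: (B, N, N) adjacency of the cell DAG; ops: (B, N, num_ops) one-hot operations
        x = self.embed(ops)
        # Symmetrically normalized propagation matrix D^{-1/2}(A + I)D^{-1/2}
        a_hat = adj + torch.eye(adj.size(-1), device=adj.device)
        deg = a_hat.sum(-1).clamp(min=1.0)
        a_hat = a_hat / (deg.unsqueeze(-1).sqrt() * deg.unsqueeze(-2).sqrt())
        for lin in self.gnn:                          # local (subgraph) structure encoding
            x = torch.relu(lin(a_hat @ x))
        x = self.transformer(x)                       # long-range node interactions
        return self.head(x.mean(dim=1)).squeeze(-1)   # mean-pool, predict scalar accuracy

if __name__ == "__main__":
    # Toy NAS-Bench-101-style cells: 7 nodes, 5 candidate operation types, batch of 4.
    adj = torch.bernoulli(torch.full((4, 7, 7), 0.3)).triu(1)   # random upper-triangular DAGs
    ops = torch.eye(5)[torch.randint(0, 5, (4, 7))]             # random one-hot operations
    print(GNNEnhancedTransformerPredictor(num_ops=5)(adj, ops).shape)  # torch.Size([4])

In practice the structural encoding and the attention blocks can be interleaved or combined in other ways; the point here is only the overall data flow from an architecture's adjacency matrix and operation list to a scalar performance prediction.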

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-xiang24a,
  title     = {A Neural Architecture Predictor based on {GNN}-Enhanced Transformer},
  author    = {Xiang, Xunzhi and Jing, Kun and Xu, Jungang},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {1729--1737},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/xiang24a/xiang24a.pdf},
  url       = {https://proceedings.mlr.press/v238/xiang24a.html}
}
APA
Xiang, X., Jing, K. & Xu, J. (2024). A Neural Architecture Predictor based on GNN-Enhanced Transformer. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1729-1737. Available from https://proceedings.mlr.press/v238/xiang24a.html.
