A Neural Architecture Predictor based on GNN-Enhanced Transformer
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1729-1737, 2024.
Abstract
Neural architecture performance predictors are an efficient approach to architecture evaluation in Neural Architecture Search (NAS). However, existing predictors based on Graph Neural Networks (GNNs) are deficient in modeling long-range interactions between operation nodes and are prone to over-smoothing, which limits their ability to learn neural architecture representations. Furthermore, some Transformer-based predictors use simple position encodings to improve performance via the self-attention mechanism, but they fail to fully exploit the subgraph structure information of the graph. To address these problems, we propose a novel method that enhances the graph representation of neural architectures by combining GNNs and Transformer blocks. We evaluate the effectiveness of our predictor on the NAS-Bench-101 and NAS-Bench-201 benchmarks, and the architecture discovered in the DARTS search space achieves an accuracy of 97.61% on the CIFAR-10 dataset; our approach also outperforms traditional position encoding methods such as adjacency and Laplacian matrices. The code of our work is available at \url{https://github.com/GNET}.
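
The abstract does not give implementation details, but a minimal sketch of the general idea, assuming a PyTorch implementation in which GCN-style message passing supplies structure-aware node embeddings that a standard Transformer encoder then refines before pooling into an accuracy estimate, could look as follows. All module names, dimensions, and hyperparameters here are illustrative assumptions, not the authors' actual design.

    # Illustrative sketch (not the authors' released code): a GCN-style layer
    # produces structure-aware node embeddings, a Transformer encoder models
    # long-range interactions between operation nodes, and a pooled head
    # regresses the architecture's accuracy.
    import torch
    import torch.nn as nn

    class SimpleGCNLayer(nn.Module):
        """One GCN-style propagation step over a dense adjacency matrix."""
        def __init__(self, in_dim: int, out_dim: int):
            super().__init__()
            self.linear = nn.Linear(in_dim, out_dim)

        def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # x:   (batch, num_nodes, in_dim)    embedded operation types
            # adj: (batch, num_nodes, num_nodes) cell adjacency matrix
            adj = adj + torch.eye(adj.size(-1), device=adj.device)  # add self-loops
            deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
            x = torch.bmm(adj / deg, x)          # average messages from neighbours
            return torch.relu(self.linear(x))

    class GNNTransformerPredictor(nn.Module):
        """Predict an architecture's accuracy from its operation graph."""
        def __init__(self, num_ops: int, hidden: int = 128, heads: int = 4, layers: int = 2):
            super().__init__()
            self.op_embed = nn.Embedding(num_ops, hidden)
            self.gnn = SimpleGCNLayer(hidden, hidden)
            encoder_layer = nn.TransformerEncoderLayer(
                d_model=hidden, nhead=heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=layers)
            self.head = nn.Linear(hidden, 1)

        def forward(self, ops: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # ops: (batch, num_nodes) integer operation ids
            x = self.op_embed(ops)
            x = self.gnn(x, adj)                 # local structure acts as a position encoding
            x = self.encoder(x)                  # global, long-range interactions
            return self.head(x.mean(dim=1)).squeeze(-1)

    # Toy usage: 7-node, NAS-Bench-101-style cells with 5 candidate operations.
    model = GNNTransformerPredictor(num_ops=5)
    ops = torch.randint(0, 5, (8, 7))
    adj = torch.randint(0, 2, (8, 7, 7)).float()
    pred_acc = model(ops, adj)                   # shape (8,)

In this sketch the GNN output plays the role of the structure-aware position encoding that the abstract contrasts with adjacency- or Laplacian-based encodings; the Transformer encoder then captures the long-range node interactions that GNNs alone handle poorly.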