Haste Makes Waste: A Simple Approach for Scaling Graph Neural Networks
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:70160-70182, 2025.
Abstract
Graph neural networks (GNNs) have demonstrated remarkable success in graph representation learning, and various sampling approaches have been proposed to scale GNNs to applications with large-scale graphs. A promising class of GNN training algorithms takes advantage of historical embeddings to reduce computation and memory cost while maintaining the expressiveness of GNNs. However, these methods incur significant computation bias due to stale feature histories. In this paper, we provide a comprehensive analysis of their staleness and inferior performance on large-scale problems. Motivated by our findings, we propose a simple yet highly effective training algorithm (REST) that effectively reduces feature staleness, leading to significantly improved performance and convergence across varying batch sizes, especially when staleness is predominant. The proposed algorithm integrates seamlessly with existing solutions and is easy to implement, and comprehensive experiments underscore its superior performance and efficiency on large-scale benchmarks. Specifically, our improvements to state-of-the-art historical embedding methods yield 2.7% and 3.6% performance gains on the ogbn-papers100M and ogbn-products datasets, respectively, accompanied by notably accelerated convergence. The code can be found at https://github.com/RXPHD/REST.
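To make the historical-embedding idea the abstract refers to concrete, below is a minimal, hypothetical sketch of mini-batch GNN training with a per-node embedding cache (in the spirit of GNNAutoScale-style methods). All names, shapes, and the toy dense adjacency are illustrative assumptions, and this is not the paper's REST algorithm; it only shows where stale cached embeddings enter the computation.

```python
import torch
import torch.nn as nn

# Toy setup (assumed sizes, random data for illustration only).
num_nodes, in_dim, hid_dim = 1000, 32, 64
x = torch.randn(num_nodes, in_dim)                        # node features
adj = (torch.rand(num_nodes, num_nodes) < 0.01).float()   # toy dense adjacency

layer1 = nn.Linear(in_dim, hid_dim)
layer2 = nn.Linear(hid_dim, hid_dim)

# Historical cache: the last computed layer-1 embedding of every node.
hist = torch.zeros(num_nodes, hid_dim)

def train_step(batch_nodes):
    # Layer 1: computed exactly for the in-batch nodes.
    h1_batch = torch.relu(layer1(x[batch_nodes]))

    # Layer 2: in-batch neighbors use fresh embeddings; out-of-batch
    # neighbors fall back to the (possibly stale) historical cache,
    # which is the source of the staleness bias discussed above.
    h1_full = hist.clone()
    h1_full[batch_nodes] = h1_batch
    h2_batch = torch.relu(layer2(adj[batch_nodes] @ h1_full))

    # Refresh the cache for the nodes just computed.
    hist[batch_nodes] = h1_batch.detach()
    return h2_batch

# Example usage: one mini-batch of 128 nodes.
batch = torch.randperm(num_nodes)[:128]
out = train_step(batch)   # shape: (128, hid_dim)
```

The cache keeps the memory and compute of a step proportional to the batch rather than to the full multi-hop neighborhood, at the cost of reading embeddings that may be several updates old; reducing that staleness is the focus of the paper.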