How Graph Neural Networks Learn: Lessons from Training Dynamics

Chenxiao Yang, Qitian Wu, David Wipf, Ruoyu Sun, Junchi Yan
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:56594-56623, 2024.

Abstract

A long-standing goal in deep learning has been to characterize the learning behavior of black-box models in a more interpretable manner. For graph neural networks (GNNs), considerable advances have been made in formalizing what functions they can represent, but whether GNNs will learn desired functions during optimization remains less clear. To fill this gap, we study their training dynamics in function space. In particular, we find that gradient-descent optimization of GNNs implicitly leverages the graph structure to update the learned function. We dub this phenomenon kernel-graph alignment and corroborate it both empirically and theoretically. This new analytical framework, grounded in the optimization perspective, enables interpretable explanations of when and why learned GNN functions generalize, which in turn sheds light on their limitations on heterophilic graphs. From a practical standpoint, it also provides high-level principles for designing new algorithms. We exemplify this by showing that a simple and efficient non-parametric algorithm, obtained by explicitly using the graph structure to update the learned function, consistently competes with nonlinear GNNs.
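
The sketch below makes the last point concrete: a non-parametric predictor can be updated by repeatedly propagating the residuals on training nodes through the graph, so the graph structure itself drives the function update. This is a minimal illustration under assumed inputs (a normalized adjacency matrix A_hat, one-hot labels Y, and a boolean train_mask); the function name and details are illustrative, not necessarily the authors' exact algorithm.

```python
import numpy as np

def propagate_residuals(A_hat, Y, train_mask, num_steps=10, lr=0.5):
    """Minimal sketch (not necessarily the paper's exact method): update a
    non-parametric predictor F by pushing training-node residuals through
    the graph, so the graph structure drives the function update.

    A_hat      : (n, n) normalized adjacency matrix (assumed given)
    Y          : (n, c) one-hot labels; only rows selected by train_mask are used
    train_mask : (n,) boolean mask of training nodes
    """
    n, c = Y.shape
    F = np.zeros((n, c))                                # current predictions for all nodes
    for _ in range(num_steps):
        R = np.zeros((n, c))
        R[train_mask] = Y[train_mask] - F[train_mask]   # residuals on training nodes only
        F = F + lr * (A_hat @ R)                        # propagate residuals over the graph to update F
    return F.argmax(axis=1)                             # predicted class per node
```

Replacing A_hat with a neural tangent kernel recovers the familiar function-space view of gradient descent, f_{t+1} = f_t + lr * K * (y - f_t) on training nodes; the kernel-graph alignment described in the abstract says that, for GNNs, these two matrices play closely related roles.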

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-yang24ae,
  title     = {How Graph Neural Networks Learn: Lessons from Training Dynamics},
  author    = {Yang, Chenxiao and Wu, Qitian and Wipf, David and Sun, Ruoyu and Yan, Junchi},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {56594--56623},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/yang24ae/yang24ae.pdf},
  url       = {https://proceedings.mlr.press/v235/yang24ae.html}
}
APA
Yang, C., Wu, Q., Wipf, D., Sun, R. & Yan, J. (2024). How Graph Neural Networks Learn: Lessons from Training Dynamics. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:56594-56623. Available from https://proceedings.mlr.press/v235/yang24ae.html.