Graph Neural Network Generalization With Gaussian Mixture Model Based Augmentation

Yassine Abbahaddou, Fragkiskos D. Malliaros, Johannes F. Lutzeyer, Amine M. Aboussalah, Michalis Vazirgiannis
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:102-125, 2025.

Abstract

Graph Neural Networks (GNNs) have shown great promise in tasks like node and graph classification, but they often struggle to generalize, particularly to unseen or out-of-distribution (OOD) data. These challenges are exacerbated when training data is limited in size or diversity. To address these issues, we introduce a theoretical framework using Rademacher complexity to compute a regret bound on the generalization error and then characterize the effect of data augmentation. This framework informs the design of GRATIN, an efficient graph data augmentation algorithm leveraging the capability of Gaussian Mixture Models (GMMs) to approximate any distribution. Our approach not only outperforms existing augmentation techniques in terms of generalization but also offers improved time complexity, making it highly suitable for real-world applications.
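The core idea the abstract relies on is that a Gaussian Mixture Model can approximate an arbitrary data distribution, so new training points can be drawn from a mixture fitted around observed samples. The stdlib-only sketch below illustrates that sampling step on toy 1-D embeddings; it is a generic illustration of GMM-based augmentation, not the paper's GRATIN algorithm, and the function name, mixture parameters, and toy data are all hypothetical.

```python
import random

def sample_gmm(weights, means, stds, n):
    """Draw n samples from a 1-D Gaussian mixture (hypothetical illustration).

    Each draw first picks a mixture component according to `weights`,
    then samples from that component's Gaussian.
    """
    samples = []
    for _ in range(n):
        k = random.choices(range(len(weights)), weights=weights)[0]
        samples.append(random.gauss(means[k], stds[k]))
    return samples

# Augment a toy set of scalar graph embeddings by sampling near them:
# one mixture component per embedding, with a small perturbation scale.
random.seed(0)
embeddings = [0.2, 0.8, 1.1]
augmented = sample_gmm(
    weights=[1 / len(embeddings)] * len(embeddings),  # equal mixture weights
    means=embeddings,                                 # components centred on data
    stds=[0.05] * len(embeddings),                    # small noise around each point
    n=6,
)
```

In practice one would fit the mixture parameters to learned graph or node embeddings (e.g. by EM) rather than centring a component on every sample, but the sampling mechanics are the same.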

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-abbahaddou25a,
  title     = {Graph Neural Network Generalization With {G}aussian Mixture Model Based Augmentation},
  author    = {Abbahaddou, Yassine and Malliaros, Fragkiskos D. and Lutzeyer, Johannes F. and Aboussalah, Amine M. and Vazirgiannis, Michalis},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {102--125},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/abbahaddou25a/abbahaddou25a.pdf},
  url       = {https://proceedings.mlr.press/v267/abbahaddou25a.html},
  abstract  = {Graph Neural Networks (GNNs) have shown great promise in tasks like node and graph classification, but they often struggle to generalize, particularly to unseen or out-of-distribution (OOD) data. These challenges are exacerbated when training data is limited in size or diversity. To address these issues, we introduce a theoretical framework using Rademacher complexity to compute a regret bound on the generalization error and then characterize the effect of data augmentation. This framework informs the design of GRATIN, an efficient graph data augmentation algorithm leveraging the capability of Gaussian Mixture Models (GMMs) to approximate any distribution. Our approach not only outperforms existing augmentation techniques in terms of generalization but also offers improved time complexity, making it highly suitable for real-world applications.}
}
Endnote
%0 Conference Paper
%T Graph Neural Network Generalization With Gaussian Mixture Model Based Augmentation
%A Yassine Abbahaddou
%A Fragkiskos D. Malliaros
%A Johannes F. Lutzeyer
%A Amine M. Aboussalah
%A Michalis Vazirgiannis
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-abbahaddou25a
%I PMLR
%P 102--125
%U https://proceedings.mlr.press/v267/abbahaddou25a.html
%V 267
%X Graph Neural Networks (GNNs) have shown great promise in tasks like node and graph classification, but they often struggle to generalize, particularly to unseen or out-of-distribution (OOD) data. These challenges are exacerbated when training data is limited in size or diversity. To address these issues, we introduce a theoretical framework using Rademacher complexity to compute a regret bound on the generalization error and then characterize the effect of data augmentation. This framework informs the design of GRATIN, an efficient graph data augmentation algorithm leveraging the capability of Gaussian Mixture Models (GMMs) to approximate any distribution. Our approach not only outperforms existing augmentation techniques in terms of generalization but also offers improved time complexity, making it highly suitable for real-world applications.
APA
Abbahaddou, Y., Malliaros, F.D., Lutzeyer, J.F., Aboussalah, A.M. & Vazirgiannis, M. (2025). Graph Neural Network Generalization With Gaussian Mixture Model Based Augmentation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:102-125. Available from https://proceedings.mlr.press/v267/abbahaddou25a.html.