Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization

Aseem Baranwal, Kimon Fountoulakis, Aukosh Jagannath
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:684-693, 2021.

Abstract

Recently there has been increased interest in semi-supervised classification in the presence of graphical information. A new class of learning models has emerged that relies, at its most basic level, on classifying the data after first applying a graph convolution. To understand the merits of this approach, we study the classification of a mixture of Gaussians, where the data corresponds to the node attributes of a stochastic block model. We show that graph convolution extends the regime in which the data is linearly separable by a factor of roughly $1/\sqrt{D}$, where $D$ is the expected degree of a node, as compared to the mixture model data on its own. Furthermore, we find that the linear classifier obtained by minimizing the cross-entropy loss after the graph convolution generalizes to out-of-distribution data where the unseen data can have different intra- and inter-class edge probabilities from the training data.
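The setting in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes one-dimensional features, two balanced classes, a fixed sign threshold at zero in place of the classifier learned by minimizing cross-entropy, and hypothetical parameter values (`n`, `p`, `q`, `mu`, `sigma`) chosen only for illustration. It draws Gaussian mixture features on the nodes of a stochastic block model, averages each feature over the node's neighborhood (including a self-loop), and compares classification accuracy before and after this graph convolution.

```python
import random


def simulate(n=400, p=0.5, q=0.1, mu=0.5, sigma=1.0, seed=0):
    """Compare sign-threshold accuracy on raw vs. graph-convolved 1-D features.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Two balanced classes with labels -1 and +1.
    labels = [-1 if i < n // 2 else 1 for i in range(n)]
    # Gaussian mixture features: class mean is label * mu, noise std is sigma.
    x = [y * mu + rng.gauss(0.0, sigma) for y in labels]
    # Stochastic block model: edge probability p within a class, q across classes.
    neighbors = [[i] for i in range(n)]  # every node starts with a self-loop
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                neighbors[i].append(j)
                neighbors[j].append(i)
    # Graph convolution: average each feature over the node's neighborhood.
    x_conv = [sum(x[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]

    def accuracy(values):
        # Fraction of nodes whose feature sign matches the label sign.
        return sum((v > 0) == (y > 0) for v, y in zip(values, labels)) / n

    return accuracy(x), accuracy(x_conv)


acc_raw, acc_conv = simulate()
print(f"raw features: {acc_raw:.3f}  after graph convolution: {acc_conv:.3f}")
```

Averaging over a neighborhood of roughly $D$ nodes shrinks the noise standard deviation by about $1/\sqrt{D}$, while the class means shrink only by a constant factor that depends on `p` and `q`. This is why the convolved features in the sketch should be much easier to separate than the raw ones.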

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-baranwal21a,
  title = {Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization},
  author = {Baranwal, Aseem and Fountoulakis, Kimon and Jagannath, Aukosh},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {684--693},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/baranwal21a/baranwal21a.pdf},
  url = {http://proceedings.mlr.press/v139/baranwal21a.html},
  abstract = {Recently there has been increased interest in semi-supervised classification in the presence of graphical information. A new class of learning models has emerged that relies, at its most basic level, on classifying the data after first applying a graph convolution. To understand the merits of this approach, we study the classification of a mixture of Gaussians, where the data corresponds to the node attributes of a stochastic block model. We show that graph convolution extends the regime in which the data is linearly separable by a factor of roughly $1/\sqrt{D}$, where $D$ is the expected degree of a node, as compared to the mixture model data on its own. Furthermore, we find that the linear classifier obtained by minimizing the cross-entropy loss after the graph convolution generalizes to out-of-distribution data where the unseen data can have different intra- and inter-class edge probabilities from the training data.}
}
Endnote
%0 Conference Paper
%T Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization
%A Aseem Baranwal
%A Kimon Fountoulakis
%A Aukosh Jagannath
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-baranwal21a
%I PMLR
%P 684--693
%U http://proceedings.mlr.press/v139/baranwal21a.html
%V 139
%X Recently there has been increased interest in semi-supervised classification in the presence of graphical information. A new class of learning models has emerged that relies, at its most basic level, on classifying the data after first applying a graph convolution. To understand the merits of this approach, we study the classification of a mixture of Gaussians, where the data corresponds to the node attributes of a stochastic block model. We show that graph convolution extends the regime in which the data is linearly separable by a factor of roughly $1/\sqrt{D}$, where $D$ is the expected degree of a node, as compared to the mixture model data on its own. Furthermore, we find that the linear classifier obtained by minimizing the cross-entropy loss after the graph convolution generalizes to out-of-distribution data where the unseen data can have different intra- and inter-class edge probabilities from the training data.
APA
Baranwal, A., Fountoulakis, K., & Jagannath, A. (2021). Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:684-693. Available from http://proceedings.mlr.press/v139/baranwal21a.html.
