[edit]
Stable Prediction on Graphs with Agnostic Distribution Shifts
Proceedings of The KDD'23 Workshop on Causal Discovery, Prediction and Decision, PMLR 218:49-74, 2023.
Abstract
Most graph neural networks (GNNs) are proposed and evaluated under independent and identically distributed (IID) training and testing data. In real-world applications, however, agnostic distribution shifts from training to testing naturally exist, leading to unstable prediction of traditional GNNs. To bridge the gap, we pursue stable prediction on graphs, i.e., to achieve high average performance and low performance variance (stability) across non-IID testing graphs. The key to stable prediction lies in capturing stable properties that are resilient to distribution shifts. In this light, we aim to identify neighbor nodes (properties) in neighborhood aggregation that are consistently important for prediction under heterogeneous distribution shifts. To achieve this target, we propose a model-agnostic stable learning framework for GNNs. The framework performs biased selection on the observed training graph, resulting in multiple non-IID graph subsets. We train one weight predictor per subset to measure the importance of properties under a particular distribution shift, and multiple predictors could tell the properties that are consistently important. An important property should contribute to high average performance and also stability (low performance variance) across non-IID subsets. In this regard, in training importance predictors, we introduce a globally stable regularizer to reduce the variance of training losses across non-IID graph datasets. Based on the importance weights of properties across non- IID subsets, a locally stable regularizer down-weights unstable properties in prediction. We conduct extensive experiments on several graph benchmarks and a noisy industrial recommendation dataset where distribution shifts exist. The results demonstrate that our method outperforms various state-of-the-art GNNs for stable prediction on graphs with agnostic distribution shifts.