Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features

Daeho Um, Sunoh Kim, Jiwoong Park, Jongin Lim, Seong Jin Ahn, Seulki Park
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:60530-60560, 2025.

Abstract

In this paper, we address learning tasks on graphs with missing features, enhancing the applicability of graph neural networks to real-world graph-structured data. We identify a critical limitation of existing imputation methods based on feature propagation: they produce channels with nearly identical values within each channel, and these low-variance channels contribute very little to performance in graph learning tasks. To overcome this issue, we introduce synthetic features that target the root cause of low-variance channel production, thereby increasing variance in these channels. By preventing propagation-based imputation methods from generating meaningless feature values shared across all nodes, our synthetic feature propagation scheme mitigates significant performance degradation, even under extreme missing rates. Extensive experiments demonstrate the effectiveness of our approach across various graph learning tasks with missing features, ranging from low to extremely high missing rates. Additionally, we provide both empirical evidence and theoretical proof to validate the low-variance problem. The source code is available at https://github.com/daehoum1/fisf.
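The low-variance problem described in the abstract can be reproduced with a minimal sketch (not the authors' implementation; graph, seed, and iteration count are illustrative choices): iteratively propagate features over a row-normalized adjacency matrix and re-inject the observed entries after each step. A channel with very few observed entries converges to nearly identical values at every node, while a densely observed channel keeps its variance.

```python
import numpy as np

# Toy sketch of propagation-based feature imputation on a path graph,
# illustrating the low-variance channel issue the paper identifies.
n = 20
A = np.zeros((n, n))
for i in range(n - 1):                    # path graph: i -- i+1
    A[i, i + 1] = A[i + 1, i] = 1.0
P = A / A.sum(axis=1, keepdims=True)      # row-normalized (random-walk) adjacency

rng = np.random.default_rng(0)
x_true = rng.normal(size=(n, 2))          # two feature channels
mask = np.zeros((n, 2), dtype=bool)
mask[:, 0] = rng.random(n) < 0.9          # channel 0: ~90% of entries observed
mask[0, 1] = True                         # channel 1: a single observed entry

x = np.where(mask, x_true, 0.0)           # missing entries start at zero
for _ in range(50_000):                   # propagate, then re-inject knowns
    x = P @ x
    x[mask] = x_true[mask]

# The sparsely observed channel collapses toward a constant vector,
# while the densely observed channel retains substantial variance.
print(x[:, 1].var(), x[:, 0].var())
```

Under this setup, channel 1 converges to a near-constant vector (variance close to zero), which is exactly the kind of "meaningless feature values shared across all nodes" that the proposed synthetic-feature injection is designed to prevent.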

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-um25a,
  title     = {Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features},
  author    = {Um, Daeho and Kim, Sunoh and Park, Jiwoong and Lim, Jongin and Ahn, Seong Jin and Park, Seulki},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {60530--60560},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/um25a/um25a.pdf},
  url       = {https://proceedings.mlr.press/v267/um25a.html},
  abstract  = {In this paper, we address learning tasks on graphs with missing features, enhancing the applicability of graph neural networks to real-world graph-structured data. We identify a critical limitation of existing imputation methods based on feature propagation: they produce channels with nearly identical values within each channel, and these low-variance channels contribute very little to performance in graph learning tasks. To overcome this issue, we introduce synthetic features that target the root cause of low-variance channel production, thereby increasing variance in these channels. By preventing propagation-based imputation methods from generating meaningless feature values shared across all nodes, our synthetic feature propagation scheme mitigates significant performance degradation, even under extreme missing rates. Extensive experiments demonstrate the effectiveness of our approach across various graph learning tasks with missing features, ranging from low to extremely high missing rates. Additionally, we provide both empirical evidence and theoretical proof to validate the low-variance problem. The source code is available at https://github.com/daehoum1/fisf.}
}
Endnote
%0 Conference Paper
%T Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features
%A Daeho Um
%A Sunoh Kim
%A Jiwoong Park
%A Jongin Lim
%A Seong Jin Ahn
%A Seulki Park
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-um25a
%I PMLR
%P 60530--60560
%U https://proceedings.mlr.press/v267/um25a.html
%V 267
%X In this paper, we address learning tasks on graphs with missing features, enhancing the applicability of graph neural networks to real-world graph-structured data. We identify a critical limitation of existing imputation methods based on feature propagation: they produce channels with nearly identical values within each channel, and these low-variance channels contribute very little to performance in graph learning tasks. To overcome this issue, we introduce synthetic features that target the root cause of low-variance channel production, thereby increasing variance in these channels. By preventing propagation-based imputation methods from generating meaningless feature values shared across all nodes, our synthetic feature propagation scheme mitigates significant performance degradation, even under extreme missing rates. Extensive experiments demonstrate the effectiveness of our approach across various graph learning tasks with missing features, ranging from low to extremely high missing rates. Additionally, we provide both empirical evidence and theoretical proof to validate the low-variance problem. The source code is available at https://github.com/daehoum1/fisf.
APA
Um, D., Kim, S., Park, J., Lim, J., Ahn, S. J., & Park, S. (2025). Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:60530-60560. Available from https://proceedings.mlr.press/v267/um25a.html.