Toward Data-centric Directed Graph Learning: An Entropy-driven Approach

Xunkai Li, Zhengyu Wu, Kaichi Yu, Hongchao Qin, Guang Zeng, Rong-Hua Li, Guoren Wang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:36310-36339, 2025.

Abstract

Although directed graphs (digraphs) offer strong modeling capabilities for complex topological systems, existing DiGraph Neural Networks (DiGNNs) struggle to fully capture the rich structural information concealed in them. This data-level limitation leads to sub-optimal predictive performance at the model level and underscores the need to further explore the potential correlations between directed edges (topology) and node profiles (features and labels) from a data-centric perspective, thereby giving model-centric neural networks stronger encoding capabilities. In this paper, we propose Entropy-driven Digraph knowlEdge distillatioN (EDEN), which can serve either as a data-centric digraph learning paradigm or as a model-agnostic, plug-and-play data-centric Knowledge Distillation (KD) module. EDEN implements data-centric machine learning by constructing a coarse-grained Hierarchical Knowledge Tree (HKT) using the proposed hierarchical encoding theory, and then refining the HKT through mutual information analysis of node profiles to guide knowledge distillation during training. As a general framework, EDEN naturally extends to undirected graphs and consistently delivers strong performance. Extensive experiments on 14 (di)graph datasets, spanning both homophily and heterophily settings, and across four downstream tasks show that EDEN achieves SOTA results and significantly enhances existing (Di)GNNs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-li25cy,
  title     = {Toward Data-centric Directed Graph Learning: An Entropy-driven Approach},
  author    = {Li, Xunkai and Wu, Zhengyu and Yu, Kaichi and Qin, Hongchao and Zeng, Guang and Li, Rong-Hua and Wang, Guoren},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {36310--36339},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/li25cy/li25cy.pdf},
  url       = {https://proceedings.mlr.press/v267/li25cy.html},
  abstract  = {Although directed graphs (digraphs) offer strong modeling capabilities for complex topological systems, existing DiGraph Neural Networks (DiGNNs) struggle to fully capture the concealed rich structural information. This data-level limitation results in model-level sub-optimal predictive performance and underscores the necessity of further exploring the potential correlations between the directed edges (topology) and node profiles (features and labels) from a data-centric perspective, thereby empowering model-centric neural networks with stronger encoding capabilities. In this paper, we propose Entropy-driven Digraph knowlEdge distillatioN (EDEN), which can serve as a data-centric digraph learning paradigm or a model-agnostic hot-and-plug data-centric Knowledge Distillation (KD) module. EDEN implements data-centric machine learning by constructing a coarse-grained Hierarchical Knowledge Tree (HKT) using proposed hierarchical encoding theory, and refining HKT through mutual information analysis of node profiles to guide knowledge distillation during training. As a general framework, EDEN naturally extends to undirected graphs and consistently delivers strong performance. Extensive experiments on 14 (di)graph datasets---spanning both homophily and heterophily settings---and across four downstream tasks show that EDEN achieves SOTA results and significantly enhances existing (Di)GNNs.}
}
Endnote
%0 Conference Paper
%T Toward Data-centric Directed Graph Learning: An Entropy-driven Approach
%A Xunkai Li
%A Zhengyu Wu
%A Kaichi Yu
%A Hongchao Qin
%A Guang Zeng
%A Rong-Hua Li
%A Guoren Wang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-li25cy
%I PMLR
%P 36310--36339
%U https://proceedings.mlr.press/v267/li25cy.html
%V 267
%X Although directed graphs (digraphs) offer strong modeling capabilities for complex topological systems, existing DiGraph Neural Networks (DiGNNs) struggle to fully capture the concealed rich structural information. This data-level limitation results in model-level sub-optimal predictive performance and underscores the necessity of further exploring the potential correlations between the directed edges (topology) and node profiles (features and labels) from a data-centric perspective, thereby empowering model-centric neural networks with stronger encoding capabilities. In this paper, we propose Entropy-driven Digraph knowlEdge distillatioN (EDEN), which can serve as a data-centric digraph learning paradigm or a model-agnostic hot-and-plug data-centric Knowledge Distillation (KD) module. EDEN implements data-centric machine learning by constructing a coarse-grained Hierarchical Knowledge Tree (HKT) using proposed hierarchical encoding theory, and refining HKT through mutual information analysis of node profiles to guide knowledge distillation during training. As a general framework, EDEN naturally extends to undirected graphs and consistently delivers strong performance. Extensive experiments on 14 (di)graph datasets—spanning both homophily and heterophily settings—and across four downstream tasks show that EDEN achieves SOTA results and significantly enhances existing (Di)GNNs.
APA
Li, X., Wu, Z., Yu, K., Qin, H., Zeng, G., Li, R.-H., & Wang, G. (2025). Toward Data-centric Directed Graph Learning: An Entropy-driven Approach. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:36310-36339. Available from https://proceedings.mlr.press/v267/li25cy.html.
