Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks

Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca, Luis Müller, Jan Tönshoff, Antoine Siraudin, Viktor Zaverkin, Michael M. Bronstein, Mathias Niepert, Bryan Perozzi, Mikhail Galkin, Christopher Morris
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:81067-81089, 2025.

Abstract

While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research, unlocking the potential of graph learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-bechler-speicher25a,
  title = {Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks},
  author = {Bechler-Speicher, Maya and Finkelshtein, Ben and Frasca, Fabrizio and M\"{u}ller, Luis and T\"{o}nshoff, Jan and Siraudin, Antoine and Zaverkin, Viktor and Bronstein, Michael M. and Niepert, Mathias and Perozzi, Bryan and Galkin, Mikhail and Morris, Christopher},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {81067--81089},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/bechler-speicher25a/bechler-speicher25a.pdf},
  url = {https://proceedings.mlr.press/v267/bechler-speicher25a.html},
  abstract = {While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research, unlocking the potential of graph learning.}
}
Endnote
%0 Conference Paper
%T Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks
%A Maya Bechler-Speicher
%A Ben Finkelshtein
%A Fabrizio Frasca
%A Luis Müller
%A Jan Tönshoff
%A Antoine Siraudin
%A Viktor Zaverkin
%A Michael M. Bronstein
%A Mathias Niepert
%A Bryan Perozzi
%A Mikhail Galkin
%A Christopher Morris
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-bechler-speicher25a
%I PMLR
%P 81067--81089
%U https://proceedings.mlr.press/v267/bechler-speicher25a.html
%V 267
%X While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research, unlocking the potential of graph learning.
APA
Bechler-Speicher, M., Finkelshtein, B., Frasca, F., Müller, L., Tönshoff, J., Siraudin, A., Zaverkin, V., Bronstein, M. M., Niepert, M., Perozzi, B., Galkin, M. & Morris, C. (2025). Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:81067-81089. Available from https://proceedings.mlr.press/v267/bechler-speicher25a.html.
