Open Biomedical Network Benchmark: A Python Toolkit for Benchmarking Datasets with Biomedical Networks

Renming Liu; Arjun Krishnan

Open Biomedical Network Benchmark: A Python Toolkit for Benchmarking Datasets with Biomedical Networks

Renming Liu, Arjun Krishnan

Proceedings of the 18th Machine Learning in Computational Biology meeting, PMLR 240:23-59, 2024.

Abstract

Over the past decades, network biology has been a major driver of computational methods developed to better understand the functional roles of each gene in the human genome in their cellular context. Following the application of traditional semi-supervised and supervised machine learning (ML) techniques, the next wave of advances in network biology will come from leveraging graph neural networks (GNN). However, to test new GNN-based approaches, a systematic and comprehensive benchmarking resource that spans a diverse selection of biomedical networks and gene classification tasks is lacking. Here, we present the Open Biomedical Network Benchmark (OBNB), a collection of node-classification benchmarking datasets derived using networks from 15 sources and tasks that include predicting genes associated with a wide range of functions, traits, and diseases. The accompanying Python package, obnb, contains reusable modules that enable researchers to download source data from public databases or archived versions and set up ML-ready datasets that are compatible with popular GNN frameworks such as PyG and DGL. Our work lays the foundation for novel GNN applications in network biology. obob will also help network biologists easily set-up custom benchmarking datasets for answering new questions of interest and collaboratively engage with graph ML practitioners to enhance our understanding of the human genome. OBNB is released under the MIT license and is freely available on GitHub: https://github.com/krishnanlab/obnb

Cite this Paper

BibTeX


@InProceedings{pmlr-v240-liu24a,
  title = 	 {Open Biomedical Network Benchmark: A Python Toolkit for Benchmarking Datasets with Biomedical Networks},
  author =       {Liu, Renming and Krishnan, Arjun},
  booktitle = 	 {Proceedings of the 18th Machine Learning in Computational Biology meeting},
  pages = 	 {23--59},
  year = 	 {2024},
  editor = 	 {Knowles, David A. and Mostafavi, Sara},
  volume = 	 {240},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {30 Nov--01 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v240/liu24a/liu24a.pdf},
  url = 	 {https://proceedings.mlr.press/v240/liu24a.html},
  abstract = 	 {Over the past decades, network biology has been a major driver of computational methods developed to better understand the functional roles of each gene in the human genome in their cellular context. Following the application of traditional semi-supervised and supervised machine learning (ML) techniques, the next wave of advances in network biology will come from leveraging graph neural networks (GNN). However, to test new GNN-based approaches, a systematic and comprehensive benchmarking resource that spans a diverse selection of biomedical networks and gene classification tasks is lacking. Here, we present the Open Biomedical Network Benchmark (OBNB), a collection of node-classification benchmarking datasets derived using networks from 15 sources and tasks that include predicting genes associated with a wide range of functions, traits, and diseases. The accompanying Python package, obnb, contains reusable modules that enable researchers to download source data from public databases or archived versions and set up ML-ready datasets that are compatible with popular GNN frameworks such as PyG and DGL. Our work lays the foundation for novel GNN applications in network biology. obob will also help network biologists easily set-up custom benchmarking datasets for answering new questions of interest and collaboratively engage with graph ML practitioners to enhance our understanding of the human genome. OBNB is released under the MIT license and is freely available on GitHub: https://github.com/krishnanlab/obnb}
}

Endnote

%0 Conference Paper
%T Open Biomedical Network Benchmark: A Python Toolkit for Benchmarking Datasets with Biomedical Networks
%A Renming Liu
%A Arjun Krishnan
%B Proceedings of the 18th Machine Learning in Computational Biology meeting
%C Proceedings of Machine Learning Research
%D 2024
%E David A. Knowles
%E Sara Mostafavi	
%F pmlr-v240-liu24a
%I PMLR
%P 23--59
%U https://proceedings.mlr.press/v240/liu24a.html
%V 240
%X Over the past decades, network biology has been a major driver of computational methods developed to better understand the functional roles of each gene in the human genome in their cellular context. Following the application of traditional semi-supervised and supervised machine learning (ML) techniques, the next wave of advances in network biology will come from leveraging graph neural networks (GNN). However, to test new GNN-based approaches, a systematic and comprehensive benchmarking resource that spans a diverse selection of biomedical networks and gene classification tasks is lacking. Here, we present the Open Biomedical Network Benchmark (OBNB), a collection of node-classification benchmarking datasets derived using networks from 15 sources and tasks that include predicting genes associated with a wide range of functions, traits, and diseases. The accompanying Python package, obnb, contains reusable modules that enable researchers to download source data from public databases or archived versions and set up ML-ready datasets that are compatible with popular GNN frameworks such as PyG and DGL. Our work lays the foundation for novel GNN applications in network biology. obob will also help network biologists easily set-up custom benchmarking datasets for answering new questions of interest and collaboratively engage with graph ML practitioners to enhance our understanding of the human genome. OBNB is released under the MIT license and is freely available on GitHub: https://github.com/krishnanlab/obnb

APA


Liu, R. & Krishnan, A.. (2024). Open Biomedical Network Benchmark: A Python Toolkit for Benchmarking Datasets with Biomedical Networks. Proceedings of the 18th Machine Learning in Computational Biology meeting, in Proceedings of Machine Learning Research 240:23-59 Available from https://proceedings.mlr.press/v240/liu24a.html.

Related Material

Download PDF