Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae

Filip Kronström, Daniel Brunnsåker, Ievgeniia A. Tiukova, Ross D. King
Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning, PMLR 284:891-912, 2025.

Abstract

We present a method that uses graph neural networks (GNNs) to predict and interpret digenic deletion fitness in the yeast Saccharomyces cerevisiae from a knowledge graph (KG) with ontology-based box embeddings. We construct the KG from community databases using terms defined in several ontologies. From the class hierarchies in the ontologies, box embeddings are learnt as low-dimensional representations of the nodes in the graph, which are used together with GNNs to predict cell growth for digenic deletions from the KG. With this, we show that high-level qualitative information can be used to predict experimental data. Prediction performance was improved when using box embeddings of ontologies to represent the nodes in the graph, compared to learning features specific to this task. This suggests that class hierarchies in ontologies contain useful information about the domains, which can be extracted in the training of the box embeddings. We also demonstrate that our model can generalise beyond the task it was trained for by evaluating it on higher-order gene deletions. Additionally, we apply model interpretability techniques to identify co-occurring edges critical for fitness. Our findings are further validated by a biological experiment that reveals an association between inositol utilization and osmotic stress resistance, emphasising the model’s potential to guide scientific discovery.
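As a rough illustration of the idea behind box embeddings of a class hierarchy, the sketch below represents each class as an axis-aligned box and scores "child is a subclass of parent" by how much of the child's box lies inside the parent's. This is a generic textbook-style formulation and not the training procedure from the paper; the class names, coordinates, and the `containment_score` helper are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners."""
    lower: Sequence[float]
    upper: Sequence[float]

    def volume(self) -> float:
        # Product of side lengths; an empty box (upper < lower) has volume 0.
        vol = 1.0
        for lo, hi in zip(self.lower, self.upper):
            vol *= max(hi - lo, 0.0)
        return vol


def intersection(a: Box, b: Box) -> Box:
    """Box covering the overlap of a and b (may be empty)."""
    lower = tuple(max(x, y) for x, y in zip(a.lower, b.lower))
    upper = tuple(min(x, y) for x, y in zip(a.upper, b.upper))
    return Box(lower, upper)


def containment_score(child: Box, parent: Box) -> float:
    """Fraction of the child's volume that lies inside the parent."""
    child_vol = child.volume()
    if child_vol == 0.0:
        return 0.0
    return intersection(child, parent).volume() / child_vol


# Hypothetical 2-D boxes for three ontology classes (illustrative values only).
process = Box((0.0, 0.0), (4.0, 4.0))     # a broad parent class
transport = Box((1.0, 1.0), (3.0, 3.0))   # a subclass nested inside it
unrelated = Box((5.0, 5.0), (6.0, 6.0))   # a class outside the parent

print(containment_score(transport, process))  # subclass fully inside parent
print(containment_score(process, transport))  # parent only partly inside child
print(containment_score(unrelated, process))  # disjoint boxes
```

In a learned setting the box corners would be trainable parameters fitted so that subclass relations in the ontology yield scores near 1; the resulting box parameters can then serve as node features for a downstream model.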

Cite this Paper


BibTeX
@InProceedings{pmlr-v284-kronstrom25a,
  title = {Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae},
  author = {Kronstr\"{o}m, Filip and Brunns{\aa}ker, Daniel and Tiukova, Ievgeniia A. and King, Ross D.},
  booktitle = {Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning},
  pages = {891--912},
  year = {2025},
  editor = {Gilpin, Leilani H. and Giunchiglia, Eleonora and Hitzler, Pascal and van Krieken, Emile},
  volume = {284},
  series = {Proceedings of Machine Learning Research},
  month = {08--10 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v284/main/assets/kronstrom25a/kronstrom25a.pdf},
  url = {https://proceedings.mlr.press/v284/kronstrom25a.html},
  abstract = {We present a method that uses graph neural networks (GNNs) to predict and interpret digenic deletion fitness in the yeast Saccharomyces cerevisiae from a knowledge graph (KG) with ontology-based box embeddings. We construct the KG from community databases using terms defined in several ontologies. From the class hierarchies in the ontologies, box embeddings are learnt as low dimensional representations of the nodes in the graph, which are used together with GNNs to predict cell growth for digenic deletions from the KG. With this we show that high level qualitative information can be used to predict experimental data. Prediction performance was improved when using box embeddings of ontologies to represent the nodes in the graph, compared to learning features specific for this task. This suggests that class hierarchies in ontologies contain useful information about the domains, which can be extracted in the training of the box embeddings. We also demonstrate that our model can generalise beyond the task it was trained for by evaluating it on higher order gene deletions. Additionally, we apply model interpretability techniques to identify co-occurring edges critical for fitness. Our findings are further validated by a biological experiment that reveals an association between inositol utilization and osmotic stress resistance, emphasising the model’s potential to guide scientific discovery.}
}
Endnote
%0 Conference Paper
%T Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae
%A Filip Kronström
%A Daniel Brunnsåker
%A Ievgeniia A. Tiukova
%A Ross D. King
%B Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Leilani H. Gilpin
%E Eleonora Giunchiglia
%E Pascal Hitzler
%E Emile van Krieken
%F pmlr-v284-kronstrom25a
%I PMLR
%P 891--912
%U https://proceedings.mlr.press/v284/kronstrom25a.html
%V 284
%X We present a method that uses graph neural networks (GNNs) to predict and interpret digenic deletion fitness in the yeast Saccharomyces cerevisiae from a knowledge graph (KG) with ontology-based box embeddings. We construct the KG from community databases using terms defined in several ontologies. From the class hierarchies in the ontologies, box embeddings are learnt as low dimensional representations of the nodes in the graph, which are used together with GNNs to predict cell growth for digenic deletions from the KG. With this we show that high level qualitative information can be used to predict experimental data. Prediction performance was improved when using box embeddings of ontologies to represent the nodes in the graph, compared to learning features specific for this task. This suggests that class hierarchies in ontologies contain useful information about the domains, which can be extracted in the training of the box embeddings. We also demonstrate that our model can generalise beyond the task it was trained for by evaluating it on higher order gene deletions. Additionally, we apply model interpretability techniques to identify co-occurring edges critical for fitness. Our findings are further validated by a biological experiment that reveals an association between inositol utilization and osmotic stress resistance, emphasising the model’s potential to guide scientific discovery.
APA
Kronström, F., Brunnsåker, D., Tiukova, I. A., & King, R. D. (2025). Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae. Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning, in Proceedings of Machine Learning Research 284:891-912. Available from https://proceedings.mlr.press/v284/kronstrom25a.html.
