Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae
Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning, PMLR 284:891-912, 2025.
Abstract
We present a method that uses graph neural networks (GNNs) to predict and interpret digenic deletion fitness in the yeast Saccharomyces cerevisiae from a knowledge graph (KG) with ontology-based box embeddings. We construct the KG from community databases using terms defined in several ontologies. From the class hierarchies in these ontologies, we learn box embeddings as low-dimensional representations of the nodes in the graph, and use them together with GNNs to predict cell growth for digenic deletions from the KG. This shows that high-level qualitative information can be used to predict experimental data. Prediction performance improved when the nodes in the graph were represented by box embeddings of the ontologies, compared with learning features specific to this task. This suggests that class hierarchies in ontologies contain useful information about their domains, which can be extracted when training the box embeddings. We also demonstrate that our model can generalise beyond the task it was trained for by evaluating it on higher-order gene deletions. Additionally, we apply model interpretability techniques to identify co-occurring edges critical for fitness. Our findings are further validated by a biological experiment that reveals an association between inositol utilization and osmotic stress resistance, emphasising the model’s potential to guide scientific discovery.