KGEx: Explaining Knowledge Graph Embeddings via Subgraph Sampling and Knowledge Distillation

Vasileios Baltatzis, Luca Costabello
Proceedings of the Second Learning on Graphs Conference, PMLR 231:27:1-27:13, 2024.

Abstract

Despite being the go-to choice for link prediction on knowledge graphs, research on interpretability of knowledge graph embeddings (KGE) has been relatively unexplored. We present KGEx, a novel post-hoc method that explains individual link predictions by drawing inspiration from surrogate models research. Given a target triple to predict, KGEx trains surrogate KGE models that we use to identify important training triples. To gauge the impact of a training triple, we sample random portions of the target triple neighborhood and we train multiple surrogate KGE models on each of them. To ensure faithfulness, each surrogate is trained by distilling knowledge from the original KGE model. We then assess how well surrogates predict the target triple being explained, the intuition being that those leading to faithful predictions have been trained on "impactful" neighborhood samples. Under this assumption, we then harvest triples that appear frequently across impactful neighborhoods. We conduct extensive experiments on two publicly available datasets, to demonstrate that KGEx is capable of providing explanations faithful to the black-box model.

Cite this Paper


BibTeX
@InProceedings{pmlr-v231-baltatzis24a,
  title     = {KGEx: Explaining Knowledge Graph Embeddings via Subgraph Sampling and Knowledge Distillation},
  author    = {Baltatzis, Vasileios and Costabello, Luca},
  booktitle = {Proceedings of the Second Learning on Graphs Conference},
  pages     = {27:1--27:13},
  year      = {2024},
  editor    = {Villar, Soledad and Chamberlain, Benjamin},
  volume    = {231},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v231/baltatzis24a/baltatzis24a.pdf},
  url       = {https://proceedings.mlr.press/v231/baltatzis24a.html},
  abstract  = {Despite being the go-to choice for link prediction on knowledge graphs, research on interpretability of knowledge graph embeddings (KGE) has been relatively unexplored. We present KGEx, a novel post-hoc method that explains individual link predictions by drawing inspiration from surrogate models research. Given a target triple to predict, KGEx trains surrogate KGE models that we use to identify important training triples. To gauge the impact of a training triple, we sample random portions of the target triple neighborhood and we train multiple surrogate KGE models on each of them. To ensure faithfulness, each surrogate is trained by distilling knowledge from the original KGE model. We then assess how well surrogates predict the target triple being explained, the intuition being that those leading to faithful predictions have been trained on ``impactful'' neighborhood samples. Under this assumption, we then harvest triples that appear frequently across impactful neighborhoods. We conduct extensive experiments on two publicly available datasets, to demonstrate that KGEx is capable of providing explanations faithful to the black-box model.}
}
Endnote
%0 Conference Paper
%T KGEx: Explaining Knowledge Graph Embeddings via Subgraph Sampling and Knowledge Distillation
%A Vasileios Baltatzis
%A Luca Costabello
%B Proceedings of the Second Learning on Graphs Conference
%C Proceedings of Machine Learning Research
%D 2024
%E Soledad Villar
%E Benjamin Chamberlain
%F pmlr-v231-baltatzis24a
%I PMLR
%P 27:1--27:13
%U https://proceedings.mlr.press/v231/baltatzis24a.html
%V 231
%X Despite being the go-to choice for link prediction on knowledge graphs, research on interpretability of knowledge graph embeddings (KGE) has been relatively unexplored. We present KGEx, a novel post-hoc method that explains individual link predictions by drawing inspiration from surrogate models research. Given a target triple to predict, KGEx trains surrogate KGE models that we use to identify important training triples. To gauge the impact of a training triple, we sample random portions of the target triple neighborhood and we train multiple surrogate KGE models on each of them. To ensure faithfulness, each surrogate is trained by distilling knowledge from the original KGE model. We then assess how well surrogates predict the target triple being explained, the intuition being that those leading to faithful predictions have been trained on "impactful" neighborhood samples. Under this assumption, we then harvest triples that appear frequently across impactful neighborhoods. We conduct extensive experiments on two publicly available datasets, to demonstrate that KGEx is capable of providing explanations faithful to the black-box model.
APA
Baltatzis, V. & Costabello, L. (2024). KGEx: Explaining Knowledge Graph Embeddings via Subgraph Sampling and Knowledge Distillation. Proceedings of the Second Learning on Graphs Conference, in Proceedings of Machine Learning Research 231:27:1-27:13. Available from https://proceedings.mlr.press/v231/baltatzis24a.html.