Popularity Agnostic Evaluation of Knowledge Graph Embeddings

Aisha Mohamed, Shameem Parambath, Zoi Kaoudi, Ashraf Aboulnaga
Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR 124:1059-1068, 2020.

Abstract

In this paper, we show that the distribution of entities and relations in common knowledge graphs is highly skewed, with some entities and relations being much more popular than the rest. We show that while knowledge graph embedding models give state-of-the-art performance in many relational learning tasks such as link prediction, current evaluation metrics like hits@k and mrr are biased towards popular entities and relations. We propose two new evaluation metrics, strat-hits@k and strat-mrr, which are unbiased estimators of the true hits@k and mrr when the items follow a power-law distribution. Our new metrics are generalizations of hits@k and mrr that take into account the popularity of the entities and relations in the data, with a tuning parameter determining how much emphasis the metric places on popular vs. unpopular items. Using our metrics, we run experiments on benchmark datasets to show that the performance of embedding models degrades as the popularity of the entities and relations decreases, and that current reported results overestimate the performance of these models by magnifying their accuracy on popular items.
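The abstract describes popularity-aware generalizations of hits@k and mrr but does not state their formulas. The sketch below shows one plausible shape for such a metric, not the paper's definition of strat-hits@k or strat-mrr. It assumes each test query is weighted by the inverse of its item's popularity raised to a power `beta`. The function name, the `beta` parameter, and this weighting scheme are all assumptions made for illustration. With `beta = 0` every query has weight 1 and the standard hits@k and mrr are recovered.

```python
from typing import Sequence, Tuple


def popularity_weighted_metrics(
    ranks: Sequence[int],
    popularity: Sequence[float],
    k: int = 10,
    beta: float = 1.0,
) -> Tuple[float, float]:
    """Return (weighted hits@k, weighted MRR).

    ranks[i]      -- 1-based rank of the correct answer for test query i
    popularity[i] -- positive popularity count of the item in query i
                     (for example, its frequency in the training triples)
    beta          -- hypothetical tuning parameter (an assumption, not taken
                     from the paper); larger values put more emphasis on
                     unpopular items, and 0 gives the unweighted metrics
    """
    if len(ranks) != len(popularity) or len(ranks) == 0:
        raise ValueError("ranks and popularity must be non-empty and equal in length")
    if any(r < 1 for r in ranks) or any(p <= 0 for p in popularity):
        raise ValueError("ranks must be >= 1 and popularity must be > 0")

    # Assumed weighting: inverse popularity raised to the power beta.
    weights = [p ** (-beta) for p in popularity]
    total = sum(weights)

    # Weighted share of queries whose correct answer is ranked in the top k.
    hits_at_k = sum(w for w, r in zip(weights, ranks) if r <= k) / total
    # Weighted mean of reciprocal ranks.
    mrr = sum(w / r for w, r in zip(weights, ranks)) / total
    return hits_at_k, mrr


if __name__ == "__main__":
    # Toy data: two popular items ranked well, two rare items ranked poorly.
    ranks = [1, 2, 50, 100]
    popularity = [1000, 1000, 1, 1]
    print(popularity_weighted_metrics(ranks, popularity, k=10, beta=0.0))
    print(popularity_weighted_metrics(ranks, popularity, k=10, beta=1.0))
```

On this toy data the unweighted hits@10 is 0.5, while the weighted scores with `beta = 1` are far lower. This mirrors the abstract's claim that standard metrics can overstate performance by magnifying accuracy on popular items.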

Cite this Paper


BibTeX
@InProceedings{pmlr-v124-mohamed20a,
  title = {Popularity Agnostic Evaluation of Knowledge Graph Embeddings},
  author = {Mohamed, Aisha and Parambath, Shameem and Kaoudi, Zoi and Aboulnaga, Ashraf},
  booktitle = {Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)},
  pages = {1059--1068},
  year = {2020},
  editor = {Peters, Jonas and Sontag, David},
  volume = {124},
  series = {Proceedings of Machine Learning Research},
  month = {03--06 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v124/mohamed20a/mohamed20a.pdf},
  url = {https://proceedings.mlr.press/v124/mohamed20a.html},
  abstract = {In this paper, we show that the distribution of entities and relations in common knowledge graphs is highly skewed, with some entities and relations being much more popular than the rest. We show that while knowledge graph embedding models give state-of-the-art performance in many relational learning tasks such as link prediction, current evaluation metrics like hits@k and mrr are biased towards popular entities and relations. We propose two new evaluation metrics, strat-hits@k and strat-mrr, which are unbiased estimators of the true hits@k and mrr when the items follow a power-law distribution. Our new metrics are generalizations of hits@k and mrr that take into account the popularity of the entities and relations in the data, with a tuning parameter determining how much emphasis the metric places on popular vs. unpopular items. Using our metrics, we run experiments on benchmark datasets to show that the performance of embedding models degrades as the popularity of the entities and relations decreases, and that current reported results overestimate the performance of these models by magnifying their accuracy on popular items.}
}
Endnote
%0 Conference Paper
%T Popularity Agnostic Evaluation of Knowledge Graph Embeddings
%A Aisha Mohamed
%A Shameem Parambath
%A Zoi Kaoudi
%A Ashraf Aboulnaga
%B Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)
%C Proceedings of Machine Learning Research
%D 2020
%E Jonas Peters
%E David Sontag
%F pmlr-v124-mohamed20a
%I PMLR
%P 1059--1068
%U https://proceedings.mlr.press/v124/mohamed20a.html
%V 124
%X In this paper, we show that the distribution of entities and relations in common knowledge graphs is highly skewed, with some entities and relations being much more popular than the rest. We show that while knowledge graph embedding models give state-of-the-art performance in many relational learning tasks such as link prediction, current evaluation metrics like hits@k and mrr are biased towards popular entities and relations. We propose two new evaluation metrics, strat-hits@k and strat-mrr, which are unbiased estimators of the true hits@k and mrr when the items follow a power-law distribution. Our new metrics are generalizations of hits@k and mrr that take into account the popularity of the entities and relations in the data, with a tuning parameter determining how much emphasis the metric places on popular vs. unpopular items. Using our metrics, we run experiments on benchmark datasets to show that the performance of embedding models degrades as the popularity of the entities and relations decreases, and that current reported results overestimate the performance of these models by magnifying their accuracy on popular items.
APA
Mohamed, A., Parambath, S., Kaoudi, Z. & Aboulnaga, A. (2020). Popularity Agnostic Evaluation of Knowledge Graph Embeddings. Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), in Proceedings of Machine Learning Research 124:1059-1068. Available from https://proceedings.mlr.press/v124/mohamed20a.html.
