Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning

Hidetaka Kamigaito, Katsuhiko Hayashi
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:10661-10675, 2022.

Abstract

Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE), enabling models to handle a huge number of entities. However, KGE performance degrades unless hyperparameters of the NS loss, such as the margin term and the number of negative samples, are appropriately selected. Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To solve this problem, we theoretically analyzed the NS loss to assist hyperparameter tuning and to clarify how the NS loss can be better used in KGE learning. Our theoretical analysis showed that scoring methods with restricted value ranges, such as TransE and RotatE, require settings of the margin term or the number of negative samples that differ from those of methods without restricted value ranges, such as RESCAL, ComplEx, and DistMult. We also propose, and theoretically analyze, subsampling methods specialized for the NS loss in KGE. Our empirical analysis on the FB15k-237, WN18RR, and YAGO3-10 datasets showed that the behavior of actually trained models agrees with our theoretical findings.
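For reference, the NS loss analyzed in this line of work is commonly written with a margin term γ and ν negative samples per positive triplet (as in the formulation popularized by Sun et al., 2019, for RotatE). The snippet below is a minimal, illustrative PyTorch sketch of that common formulation; the function name, tensor shapes, and default margin are assumptions made here for illustration and are not taken from the paper's code.

    import torch
    import torch.nn.functional as F

    def ns_loss(pos_score, neg_scores, margin=9.0):
        # pos_score:  (B,)    scores s(x, y) of positive triplets.
        # neg_scores: (B, nu) scores of nu sampled negative triplets per positive.
        # Positive term: log sigma(s(x, y) + gamma)
        pos_term = F.logsigmoid(pos_score + margin)
        # Negative term: (1 / nu) * sum_i log sigma(-s(x, y_i) - gamma)
        neg_term = F.logsigmoid(-neg_scores - margin).mean(dim=1)
        # Average over the batch; gamma (margin) and nu (number of negatives)
        # are the hyperparameters whose interaction with the scoring function's
        # value range is the subject of the paper's analysis.
        return -(pos_term + neg_term).mean()

    # Example: a batch of 2 positives with nu = 4 negatives each.
    loss = ns_loss(torch.randn(2), torch.randn(2, 4), margin=6.0)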

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-kamigaito22a,
  title     = {Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning},
  author    = {Kamigaito, Hidetaka and Hayashi, Katsuhiko},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {10661--10675},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/kamigaito22a/kamigaito22a.pdf},
  url       = {https://proceedings.mlr.press/v162/kamigaito22a.html},
  abstract  = {Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE) to handle a huge number of entities. However, the performance of KGE degrades without hyperparameters such as the margin term and number of negative samples in NS loss being appropriately selected. Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To solve this problem, we theoretically analyzed NS loss to assist hyperparameter tuning and understand the better use of the NS loss in KGE learning. Our theoretical analysis showed that scoring methods with restricted value ranges, such as TransE and RotatE, require appropriate adjustment of the margin term or the number of negative samples different from those without restricted value ranges, such as RESCAL, ComplEx, and DistMult. We also propose subsampling methods specialized for the NS loss in KGE studied from a theoretical aspect. Our empirical analysis on the FB15k-237, WN18RR, and YAGO3-10 datasets showed that the results of actually trained models agree with our theoretical findings.}
}
Endnote
%0 Conference Paper
%T Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning
%A Hidetaka Kamigaito
%A Katsuhiko Hayashi
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-kamigaito22a
%I PMLR
%P 10661--10675
%U https://proceedings.mlr.press/v162/kamigaito22a.html
%V 162
%X Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE) to handle a huge number of entities. However, the performance of KGE degrades without hyperparameters such as the margin term and number of negative samples in NS loss being appropriately selected. Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To solve this problem, we theoretically analyzed NS loss to assist hyperparameter tuning and understand the better use of the NS loss in KGE learning. Our theoretical analysis showed that scoring methods with restricted value ranges, such as TransE and RotatE, require appropriate adjustment of the margin term or the number of negative samples different from those without restricted value ranges, such as RESCAL, ComplEx, and DistMult. We also propose subsampling methods specialized for the NS loss in KGE studied from a theoretical aspect. Our empirical analysis on the FB15k-237, WN18RR, and YAGO3-10 datasets showed that the results of actually trained models agree with our theoretical findings.
APA
Kamigaito, H. & Hayashi, K. (2022). Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:10661-10675. Available from https://proceedings.mlr.press/v162/kamigaito22a.html.
