PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning

Jaejun Lee, Minsung Hwang, Joyce Jiyoung Whang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:26589-26620, 2024.

Abstract

While a number of knowledge graph representation learning (KGRL) methods have been proposed over the past decade, very few theoretical analyses have been conducted on them. In this paper, we present the first PAC-Bayesian generalization bounds for KGRL methods. To analyze a broad class of KGRL models, we propose a generic framework named ReED (Relation-aware Encoder-Decoder), which consists of a relation-aware message passing encoder and a triplet classification decoder. Our ReED framework can express at least 15 different existing KGRL models, including not only graph neural network-based models such as R-GCN and CompGCN but also shallow-architecture models such as RotatE and ANALOGY. Our generalization bounds for the ReED framework provide theoretical grounds for the commonly used tricks in KGRL, e.g., parameter-sharing and weight normalization schemes, and guide desirable design choices for practical KGRL methods. We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-lee24i, title = {{PAC}-{B}ayesian Generalization Bounds for Knowledge Graph Representation Learning}, author = {Lee, Jaejun and Hwang, Minsung and Whang, Joyce Jiyoung}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {26589--26620}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lee24i/lee24i.pdf}, url = {https://proceedings.mlr.press/v235/lee24i.html}, abstract = {While a number of knowledge graph representation learning (KGRL) methods have been proposed over the past decade, very few theoretical analyses have been conducted on them. In this paper, we present the first PAC-Bayesian generalization bounds for KGRL methods. To analyze a broad class of KGRL models, we propose a generic framework named ReED (Relation-aware Encoder-Decoder), which consists of a relation-aware message passing encoder and a triplet classification decoder. Our ReED framework can express at least 15 different existing KGRL models, including not only graph neural network-based models such as R-GCN and CompGCN but also shallow-architecture models such as RotatE and ANALOGY. Our generalization bounds for the ReED framework provide theoretical grounds for the commonly used tricks in KGRL, e.g., parameter-sharing and weight normalization schemes, and guide desirable design choices for practical KGRL methods. We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs.} }
Endnote
%0 Conference Paper %T PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning %A Jaejun Lee %A Minsung Hwang %A Joyce Jiyoung Whang %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-lee24i %I PMLR %P 26589--26620 %U https://proceedings.mlr.press/v235/lee24i.html %V 235 %X While a number of knowledge graph representation learning (KGRL) methods have been proposed over the past decade, very few theoretical analyses have been conducted on them. In this paper, we present the first PAC-Bayesian generalization bounds for KGRL methods. To analyze a broad class of KGRL models, we propose a generic framework named ReED (Relation-aware Encoder-Decoder), which consists of a relation-aware message passing encoder and a triplet classification decoder. Our ReED framework can express at least 15 different existing KGRL models, including not only graph neural network-based models such as R-GCN and CompGCN but also shallow-architecture models such as RotatE and ANALOGY. Our generalization bounds for the ReED framework provide theoretical grounds for the commonly used tricks in KGRL, e.g., parameter-sharing and weight normalization schemes, and guide desirable design choices for practical KGRL methods. We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs.
APA
Lee, J., Hwang, M. & Whang, J.J.. (2024). PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:26589-26620 Available from https://proceedings.mlr.press/v235/lee24i.html.

Related Material