Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective

Yang Chen, Cong Fang, Zhouchen Lin, Bing Liu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:6666-6698, 2024.

Abstract

Foundation Models (FMs) have demonstrated remarkable insights into the relational dynamics of the world, leading to the crucial question: how do these models acquire an understanding of the world's hybrid relations? Traditional statistical learning, particularly for prediction problems, may overlook the rich, inherently structured information in the data, especially the relationships between objects. We introduce a mathematical model that formalizes relational learning as hypergraph recovery in order to study the pre-training of FMs. In our framework, the world is represented as a hypergraph, and data are abstracted as random samples from its hyperedges. We theoretically examine the feasibility of a Pre-Trained Model (PTM) recovering this hypergraph and analyze the data efficiency in a minimax near-optimal style. By integrating rich graph theory into the realm of PTMs, our mathematical framework offers powerful tools for an in-depth understanding of pre-training from a unique perspective and can be used under various scenarios. As an example, we extend the framework to entity alignment in multimodal learning.
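To make the generative picture above concrete, the following is a minimal Python sketch (our illustration, not the authors' code): the world is a hidden hypergraph H = (V, E), each data point reveals a random subset of a random hyperedge, and pre-training is abstracted as recovering E from many such samples. The toy vertex set, hyperedges, and the naive maximal-co-occurrence estimator recover() are illustrative assumptions, not taken from the paper.

    import random

    # Hidden world hypergraph H = (V, E): vertices are entities, hyperedges
    # are relations among them (toy values chosen for illustration).
    V = list(range(10))
    E = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({4, 5, 6, 7})]

    def sample_datum(rng):
        """One data point: a random non-empty subset of a random hyperedge."""
        e = sorted(rng.choice(E))
        k = rng.randint(1, len(e))
        return frozenset(rng.sample(e, k))

    def recover(samples):
        """Naive estimator: keep the maximal observed co-occurrence sets.

        A stand-in for the paper's recovery procedure, meant only to make
        the 'hypergraph recovery' objective concrete.
        """
        observed = set(samples)
        return {s for s in observed if not any(s < t for t in observed)}

    rng = random.Random(0)
    data = [sample_datum(rng) for _ in range(5000)]
    print(recover(data) == set(E))  # True once every full hyperedge is seen

With enough samples, every full hyperedge is eventually observed, so the maximal co-occurrence sets coincide with E; the paper's minimax analysis concerns how much data such recovery fundamentally requires, a question this naive sketch ignores.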

Cite this Paper

BibTeX
@InProceedings{pmlr-v235-chen24l,
  title     = {Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective},
  author    = {Chen, Yang and Fang, Cong and Lin, Zhouchen and Liu, Bing},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {6666--6698},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/chen24l/chen24l.pdf},
  url       = {https://proceedings.mlr.press/v235/chen24l.html}
}
Endnote
%0 Conference Paper
%T Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
%A Yang Chen
%A Cong Fang
%A Zhouchen Lin
%A Bing Liu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-chen24l
%I PMLR
%P 6666--6698
%U https://proceedings.mlr.press/v235/chen24l.html
%V 235
APA
Chen, Y., Fang, C., Lin, Z. & Liu, B. (2024). Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:6666-6698. Available from https://proceedings.mlr.press/v235/chen24l.html.