Learning Representations without Compositional Assumptions

Tennison Liu, Jeroen Berrevoets, Zhaozhi Qian, Mihaela Van Der Schaar
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:21388-21403, 2023.

Abstract

This paper addresses unsupervised representation learning on tabular data containing multiple views generated by distinct sources of measurement. Traditional methods, which tackle this problem using the multi-view framework, rely on the predefined assumptions that feature sets share the same information and that representations should learn globally shared factors. However, these assumptions are not always valid for real-world tabular datasets with complex dependencies between feature sets, resulting in localized information that is harder to learn. To overcome this limitation, we propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges. Furthermore, we introduce $\texttt{LEGATO}$, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically. This approach results in latent graph components that specialize in capturing localized information from different regions of the input, leading to superior downstream performance.
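
The idea of treating feature sets as graph nodes connected by learnable edges can be illustrated with a minimal sketch. This is not the LEGATO architecture, and it contains no training loop; it only shows, under assumed names and shapes, how a matrix of free edge parameters could define a soft adjacency over views and mix their embeddings. The names `message_pass`, `node_feats`, and `edge_logits` are hypothetical and do not come from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def message_pass(node_feats, edge_logits):
    """One round of message passing over a soft adjacency (illustrative only).

    node_feats:  (num_views, dim) array, one embedding per feature set (view).
    edge_logits: (num_views, num_views) array of free parameters; in a trained
                 model these would be updated by gradient descent (assumption).
    """
    adj = sigmoid(edge_logits)                  # soft edge weights in (0, 1)
    np.fill_diagonal(adj, 1.0)                  # each node keeps its own information
    adj = adj / adj.sum(axis=1, keepdims=True)  # row-normalise: weighted neighbour average
    return adj @ node_feats


rng = np.random.default_rng(0)
num_views, dim = 4, 8                            # hypothetical sizes
node_feats = rng.normal(size=(num_views, dim))
edge_logits = np.zeros((num_views, num_views))   # uninformed start: all edges equally weighted
updated = message_pass(node_feats, edge_logits)
```

With strongly negative logits the off-diagonal edges vanish and each view keeps its own embedding, which is the sense in which dependencies between feature sets are learned from data instead of being fixed in advance.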

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-liu23c,
  title = {Learning Representations without Compositional Assumptions},
  author = {Liu, Tennison and Berrevoets, Jeroen and Qian, Zhaozhi and Van Der Schaar, Mihaela},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {21388--21403},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/liu23c/liu23c.pdf},
  url = {https://proceedings.mlr.press/v202/liu23c.html},
  abstract = {This paper addresses unsupervised representation learning on tabular data containing multiple views generated by distinct sources of measurement. Traditional methods, which tackle this problem using the multi-view framework, are constrained by predefined assumptions that assume feature sets share the same information and representations should learn globally shared factors. However, this assumption is not always valid for real-world tabular datasets with complex dependencies between feature sets, resulting in localized information that is harder to learn. To overcome this limitation, we propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges. Furthermore, we introduce $\texttt{LEGATO}$, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically. This approach results in latent graph components that specialize in capturing localized information from different regions of the input, leading to superior downstream performance.}
}
Endnote
%0 Conference Paper
%T Learning Representations without Compositional Assumptions
%A Tennison Liu
%A Jeroen Berrevoets
%A Zhaozhi Qian
%A Mihaela Van Der Schaar
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-liu23c
%I PMLR
%P 21388--21403
%U https://proceedings.mlr.press/v202/liu23c.html
%V 202
%X This paper addresses unsupervised representation learning on tabular data containing multiple views generated by distinct sources of measurement. Traditional methods, which tackle this problem using the multi-view framework, are constrained by predefined assumptions that assume feature sets share the same information and representations should learn globally shared factors. However, this assumption is not always valid for real-world tabular datasets with complex dependencies between feature sets, resulting in localized information that is harder to learn. To overcome this limitation, we propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges. Furthermore, we introduce $\texttt{LEGATO}$, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically. This approach results in latent graph components that specialize in capturing localized information from different regions of the input, leading to superior downstream performance.
APA
Liu, T., Berrevoets, J., Qian, Z., & Van Der Schaar, M. (2023). Learning Representations without Compositional Assumptions. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:21388-21403. Available from https://proceedings.mlr.press/v202/liu23c.html.
