GEMCONT: Genetics-based Multimodal Contrastive Learning Enhances Phenotypic embeddings and Boosts Genetic Discovery

Daniel Sens, Liubov Shilova, Adrian V. Dalca, Julia A. Schnabel, Francesco Paolo Casale
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:3443-3463, 2026.

Abstract

Genetic variation provides stable, time-invariant markers of disease risk and can therefore reveal upstream mechanisms underlying complex traits. Genome-wide association studies (GWAS) have identified thousands of loci associated with disease, yet most remain difficult to interpret because the intermediate phenotypes linking genotype to disease are unknown. Here, we address the question whether disease-associated genetic loci can be directly used to extract such risk-related features from quantitative phenotypes, including functional tests and medical imaging. We introduce GEMCONT (GEnetics-based Multimodal CONTrastive Learning), a multimodal contrastive learning framework that aligns genotype and phenotype representations in a shared latent space. Unlike task-agnostic multimodal pretraining, GEMCONT is disease-conditioned: GWAS-informed variant panels act as targeted supervision to learn risk-relevant imaging embeddings. To reflect the weak, additive nature of genetic effects, it employs a linear genetic encoder alongside a deep phenotypic encoder. We validate GEMCONT in controlled simulations and apply it to two real-world settings: spirometry curves for asthma and retinal fundus images for glaucoma. In both, GEMCONT improves disease risk prediction and enhances recovery of genetic associations compared with standard unsupervised or polygenic risk–based models. Altogether, our results demonstrate that incorporating stable genetic supervision into multimodal representation learning enables the extraction of genetically informed risk traits, refining disease phenotypes and improving the interpretability of association studies.

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-sens26a, title = {GEMCONT: Genetics-based Multimodal Contrastive Learning Enhances Phenotypic embeddings and Boosts Genetic Discovery}, author = {Sens, Daniel and Shilova, Liubov and Dalca, Adrian V. and Schnabel, Julia A. and Casale, Francesco Paolo}, booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning}, pages = {3443--3463}, year = {2026}, editor = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining}, volume = {315}, series = {Proceedings of Machine Learning Research}, month = {08--10 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/sens26a/sens26a.pdf}, url = {https://proceedings.mlr.press/v315/sens26a.html}, abstract = {Genetic variation provides stable, time-invariant markers of disease risk and can therefore reveal upstream mechanisms underlying complex traits. Genome-wide association studies (GWAS) have identified thousands of loci associated with disease, yet most remain difficult to interpret because the intermediate phenotypes linking genotype to disease are unknown. Here, we address the question whether disease-associated genetic loci can be directly used to extract such risk-related features from quantitative phenotypes, including functional tests and medical imaging. We introduce GEMCONT (GEnetics-based Multimodal CONTrastive Learning), a multimodal contrastive learning framework that aligns genotype and phenotype representations in a shared latent space. Unlike task-agnostic multimodal pretraining, GEMCONT is disease-conditioned: GWAS-informed variant panels act as targeted supervision to learn risk-relevant imaging embeddings. To reflect the weak, additive nature of genetic effects, it employs a linear genetic encoder alongside a deep phenotypic encoder. We validate GEMCONT in controlled simulations and apply it to two real-world settings: spirometry curves for asthma and retinal fundus images for glaucoma. In both, GEMCONT improves disease risk prediction and enhances recovery of genetic associations compared with standard unsupervised or polygenic risk–based models. Altogether, our results demonstrate that incorporating stable genetic supervision into multimodal representation learning enables the extraction of genetically informed risk traits, refining disease phenotypes and improving the interpretability of association studies.} }
Endnote
%0 Conference Paper %T GEMCONT: Genetics-based Multimodal Contrastive Learning Enhances Phenotypic embeddings and Boosts Genetic Discovery %A Daniel Sens %A Liubov Shilova %A Adrian V. Dalca %A Julia A. Schnabel %A Francesco Paolo Casale %B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning %C Proceedings of Machine Learning Research %D 2026 %E Yuankai Huo %E Mingchen Gao %E Chang-Fu Kuo %E Yueming Jin %E Ruining Deng %F pmlr-v315-sens26a %I PMLR %P 3443--3463 %U https://proceedings.mlr.press/v315/sens26a.html %V 315 %X Genetic variation provides stable, time-invariant markers of disease risk and can therefore reveal upstream mechanisms underlying complex traits. Genome-wide association studies (GWAS) have identified thousands of loci associated with disease, yet most remain difficult to interpret because the intermediate phenotypes linking genotype to disease are unknown. Here, we address the question whether disease-associated genetic loci can be directly used to extract such risk-related features from quantitative phenotypes, including functional tests and medical imaging. We introduce GEMCONT (GEnetics-based Multimodal CONTrastive Learning), a multimodal contrastive learning framework that aligns genotype and phenotype representations in a shared latent space. Unlike task-agnostic multimodal pretraining, GEMCONT is disease-conditioned: GWAS-informed variant panels act as targeted supervision to learn risk-relevant imaging embeddings. To reflect the weak, additive nature of genetic effects, it employs a linear genetic encoder alongside a deep phenotypic encoder. We validate GEMCONT in controlled simulations and apply it to two real-world settings: spirometry curves for asthma and retinal fundus images for glaucoma. In both, GEMCONT improves disease risk prediction and enhances recovery of genetic associations compared with standard unsupervised or polygenic risk–based models. Altogether, our results demonstrate that incorporating stable genetic supervision into multimodal representation learning enables the extraction of genetically informed risk traits, refining disease phenotypes and improving the interpretability of association studies.
APA
Sens, D., Shilova, L., Dalca, A.V., Schnabel, J.A. & Casale, F.P.. (2026). GEMCONT: Genetics-based Multimodal Contrastive Learning Enhances Phenotypic embeddings and Boosts Genetic Discovery. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:3443-3463 Available from https://proceedings.mlr.press/v315/sens26a.html.

Related Material