Integrating Prior Knowledge in Contrastive Learning with Kernel

Benoit Dufumier, Carlo Alberto Barbano, Robin Louiset, Edouard Duchesnay, Pietro Gori
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:8851-8878, 2023.

Abstract

Data augmentation is a crucial component in unsupervised contrastive learning (CL). It determines how positive samples are defined and, ultimately, the quality of the learned representation. In this work, we open the door to new perspectives for CL by integrating prior knowledge, given either by generative models (viewed as prior representations) or by weak attributes, into the positive and negative sampling. To this end, we use kernel theory to propose a novel loss, called decoupled uniformity, that i) allows the integration of prior knowledge and ii) removes the positive-negative coupling in the original InfoNCE loss. We draw a connection between contrastive learning and conditional mean embedding theory to derive tight bounds on the downstream classification loss. In an unsupervised setting, we empirically demonstrate that CL benefits from generative models to improve its representation on both natural and medical images. In a weakly supervised scenario, our framework outperforms other unconditional and conditional CL approaches.
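Since the abstract only names the loss, the following is a minimal sketch of the idea under stated assumptions, not the authors' reference implementation. It assumes the decoupled uniformity loss takes the form log E_{x != x'} exp(-||mu(x) - mu(x')||^2), where mu(x) is the centroid of the embeddings of x's augmented views, and it illustrates one simple way a prior representation (e.g. a generative model's latents) could reweight centroids with a Gaussian kernel. All function names, shapes, and the kernel choice here are assumptions for illustration.

# Hypothetical sketch; assumed form of the loss, not the paper's code.
import torch

def decoupled_uniformity(z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    # z1, z2: (n, d) embeddings of two augmented views of the same n samples.
    mu = 0.5 * (z1 + z2)                           # (n, d) view centroids
    sq_dists = torch.cdist(mu, mu).pow(2)          # (n, n) ||mu_i - mu_j||^2
    n = mu.shape[0]
    off_diag = ~torch.eye(n, dtype=torch.bool, device=mu.device)
    # Log-mean-exp over distinct centroid pairs: the attraction between views
    # only enters through the centroids, so there is no positive-negative
    # coupling of the InfoNCE kind.
    count = torch.tensor(float(n * (n - 1)), device=mu.device)
    return torch.logsumexp(-sq_dists[off_diag], dim=0) - torch.log(count)

def kernel_centroids(z: torch.Tensor, prior: torch.Tensor,
                     gamma: float = 1.0) -> torch.Tensor:
    # z: (n, d) embeddings; prior: (n, p) prior representations (e.g. from a
    # pretrained generative model) or weak attributes. Samples the prior deems
    # similar get blended centroids, acting as extra positives.
    k = torch.exp(-gamma * torch.cdist(prior, prior).pow(2))  # Gaussian kernel
    w = k / k.sum(dim=1, keepdim=True)                        # row-normalize
    return w @ z

Under these assumptions, prior knowledge would enter by replacing the raw centroids with kernel-weighted ones, e.g. loss = decoupled_uniformity applied to kernel_centroids(z1, prior) and kernel_centroids(z2, prior); the paper's actual estimator is derived from conditional mean embedding theory.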

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-dufumier23a,
  title     = {Integrating Prior Knowledge in Contrastive Learning with Kernel},
  author    = {Dufumier, Benoit and Barbano, Carlo Alberto and Louiset, Robin and Duchesnay, Edouard and Gori, Pietro},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {8851--8878},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/dufumier23a/dufumier23a.pdf},
  url       = {https://proceedings.mlr.press/v202/dufumier23a.html}
}
Endnote
%0 Conference Paper
%T Integrating Prior Knowledge in Contrastive Learning with Kernel
%A Benoit Dufumier
%A Carlo Alberto Barbano
%A Robin Louiset
%A Edouard Duchesnay
%A Pietro Gori
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-dufumier23a
%I PMLR
%P 8851--8878
%U https://proceedings.mlr.press/v202/dufumier23a.html
%V 202
APA
Dufumier, B., Barbano, C.A., Louiset, R., Duchesnay, E. & Gori, P. (2023). Integrating Prior Knowledge in Contrastive Learning with Kernel. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:8851-8878. Available from https://proceedings.mlr.press/v202/dufumier23a.html.