Large-Margin Contrastive Learning with Distance Polarization Regularizer

Shuo Chen, Gang Niu, Chen Gong, Jun Li, Jian Yang, Masashi Sugiyama
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1673-1683, 2021.

Abstract

Contrastive learning (CL) pretrains models in a pairwise manner: given a data point, all other data points are regarded as dissimilar, including some that are semantically similar. This issue has been addressed by properly weighting similar and dissimilar pairs, as in positive-unlabeled learning, so that the CL objective is unbiased and CL is consistent. However, in this paper we argue that this solution is still not enough: the weighted objective hides the issue that semantically similar pairs are still pushed away; since CL is used for pretraining, this behavior is undesirable and may affect downstream tasks. To this end, we propose large-margin contrastive learning (LMCL) with a distance polarization regularizer, motivated by the distribution of pairwise distances in metric learning. In LMCL, we can distinguish between intra-cluster and inter-cluster pairs, and then push away only inter-cluster pairs, which explicitly solves the above issue. Theoretically, we prove a tighter error bound for LMCL; empirically, the superiority of LMCL is demonstrated across multiple domains, i.e., image classification, sentence representation, and reinforcement learning.
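
As a rough illustration of the idea described in the abstract, the following PyTorch-style sketch combines a standard SimCLR/NT-Xent contrastive term with a simple distance-polarization-style penalty that discourages pairwise distances from falling into an ambiguous middle band. This is a minimal sketch under assumptions, not the paper's implementation: the exact penalty form, the band thresholds delta_low/delta_high, and the weight lam are illustrative placeholders, not quantities taken from the paper.

import torch
import torch.nn.functional as F

def distance_polarization_regularizer(z, delta_low=0.2, delta_high=0.8):
    # Illustrative penalty (not the paper's exact formula): pairwise distances
    # that fall inside the ambiguous band (delta_low, delta_high) are penalized,
    # so distances polarize toward the small (intra-cluster) or large
    # (inter-cluster) regime. z is an (N, d) batch of embeddings; the band
    # thresholds are hypothetical hyper-parameters.
    z = F.normalize(z, dim=1)
    dist = torch.cdist(z, z, p=2)                                # (N, N) pairwise distances
    off_diag = ~torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    dist = dist[off_diag]                                        # drop self-pairs
    # Distance to the nearest band edge; zero for distances outside the band.
    penalty = torch.minimum(dist - delta_low, delta_high - dist).clamp(min=0.0)
    return penalty.mean()

def contrastive_loss_with_polarization(z1, z2, temperature=0.5, lam=1.0):
    # Standard SimCLR/NT-Xent term on two augmented views plus the penalty above.
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)           # (2B, d)
    sim = z @ z.t() / temperature
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))                    # exclude self-similarity
    b = z1.size(0)
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)]).to(z.device)
    info_nce = F.cross_entropy(sim, targets)                     # positive of view i is its pair
    return info_nce + lam * distance_polarization_regularizer(z)

With unit-normalized embeddings, pairwise Euclidean distances lie in [0, 2], so the band (0.2, 0.8) above is only a placeholder; in practice such thresholds would be tuned, and the actual regularizer and margin values should be taken from the paper itself.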

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-chen21n,
  title     = {Large-Margin Contrastive Learning with Distance Polarization Regularizer},
  author    = {Chen, Shuo and Niu, Gang and Gong, Chen and Li, Jun and Yang, Jian and Sugiyama, Masashi},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {1673--1683},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/chen21n/chen21n.pdf},
  url       = {https://proceedings.mlr.press/v139/chen21n.html}
}
Endnote
%0 Conference Paper
%T Large-Margin Contrastive Learning with Distance Polarization Regularizer
%A Shuo Chen
%A Gang Niu
%A Chen Gong
%A Jun Li
%A Jian Yang
%A Masashi Sugiyama
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-chen21n
%I PMLR
%P 1673--1683
%U https://proceedings.mlr.press/v139/chen21n.html
%V 139
APA
Chen, S., Niu, G., Gong, C., Li, J., Yang, J., & Sugiyama, M. (2021). Large-Margin Contrastive Learning with Distance Polarization Regularizer. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1673-1683. Available from https://proceedings.mlr.press/v139/chen21n.html.