EMC²: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence

Chung-Yiu Yau, Hoi To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:56966-56981, 2024.

Abstract

A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, for learning better encodings of the data. These negative samples often follow a softmax distribution that is dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational cost of computing the partition function. In this paper, we propose an Efficient Markov Chain Monte Carlo negative sampling method for Contrastive learning (EMC²). We follow the global contrastive learning loss introduced in SogCLR, and propose EMC², which utilizes an adaptive Metropolis-Hastings subroutine to generate hardness-aware negative samples in an online fashion during the optimization. We prove that EMC² finds an O(1/√T)-stationary point of the global contrastive loss in T iterations. Compared to prior works, EMC² is the first algorithm that exhibits global convergence (to stationarity) regardless of the choice of batch size while maintaining low computation and memory cost. Numerical experiments validate that EMC² is effective with small-batch training and achieves comparable or better performance than baseline algorithms. We report results for pre-training image encoders on STL-10 and Imagenet-100.
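The core trick the abstract alludes to can be illustrated with a plain Metropolis-Hastings loop: the acceptance ratio of two softmax probabilities, exp(s_j)/Z divided by exp(s_i)/Z, cancels the partition function Z, so one can walk over candidate negatives using only raw similarity scores. The sketch below is hypothetical and simplified (a symmetric uniform proposal over indices, not the authors' adaptive subroutine); `scores` stands in for the encoder similarity logits over the negative pool.

```python
import math
import random

def mh_negative_step(scores, state, rng):
    """One Metropolis-Hastings step targeting p(i) ∝ exp(scores[i]).

    The acceptance probability exp(scores[j] - scores[i]) never touches
    the partition function, so each step costs O(1) score lookups rather
    than a full pass over the negative pool.
    """
    proposal = rng.randrange(len(scores))  # symmetric uniform proposal
    accept_prob = min(1.0, math.exp(scores[proposal] - scores[state]))
    if rng.random() < accept_prob:
        return proposal  # move to the proposed negative
    return state         # stay at the current negative

# Running the chain: high-score ("hard") negatives are visited most often,
# with empirical frequencies approaching the softmax distribution.
rng = random.Random(7)
counts = [0, 0, 0]
state = 0
for _ in range(3000):
    state = mh_negative_step([0.0, 1.0, 3.0], state, rng)
    counts[state] += 1
```

In the paper's setting, the chain state would be carried across optimization steps so the sampler adapts as the encoder (and hence the scores) changes, which is what makes the sampling "online".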

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-yau24a,
  title =     {{EMC}$^2$: Efficient {MCMC} Negative Sampling for Contrastive Learning with Global Convergence},
  author =    {Yau, Chung-Yiu and Wai, Hoi To and Raman, Parameswaran and Sarkar, Soumajyoti and Hong, Mingyi},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages =     {56966--56981},
  year =      {2024},
  editor =    {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume =    {235},
  series =    {Proceedings of Machine Learning Research},
  month =     {21--27 Jul},
  publisher = {PMLR},
  pdf =       {https://raw.githubusercontent.com/mlresearch/v235/main/assets/yau24a/yau24a.pdf},
  url =       {https://proceedings.mlr.press/v235/yau24a.html},
  abstract =  {A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, for learning better encoding of the data. These negative samples often follow a softmax distribution which are dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational costs in computing the partition function. In this paper, we propose an $\underline{\text{E}}$fficient $\underline{\text{M}}$arkov $\underline{\text{C}}$hain Monte Carlo negative sampling method for $\underline{\text{C}}$ontrastive learning (EMC$^2$). We follow the global contrastive learning loss as introduced in SogCLR, and propose EMC$^2$ which utilizes an adaptive Metropolis-Hastings subroutine to generate hardness-aware negative samples in an online fashion during the optimization. We prove that EMC$^2$ finds an $\mathcal{O}(1/\sqrt{T})$-stationary point of the global contrastive loss in $T$ iterations. Compared to prior works, EMC$^2$ is the first algorithm that exhibits global convergence (to stationarity) regardless of the choice of batch size while exhibiting low computation and memory cost. Numerical experiments validate that EMC$^2$ is effective with small batch training and achieves comparable or better performance than baseline algorithms. We report the results for pre-training image encoders on STL-10 and Imagenet-100.}
}
Endnote
%0 Conference Paper
%T EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
%A Chung-Yiu Yau
%A Hoi To Wai
%A Parameswaran Raman
%A Soumajyoti Sarkar
%A Mingyi Hong
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-yau24a
%I PMLR
%P 56966--56981
%U https://proceedings.mlr.press/v235/yau24a.html
%V 235
%X A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, for learning better encoding of the data. These negative samples often follow a softmax distribution which are dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational costs in computing the partition function. In this paper, we propose an $\underline{\text{E}}$fficient $\underline{\text{M}}$arkov $\underline{\text{C}}$hain Monte Carlo negative sampling method for $\underline{\text{C}}$ontrastive learning (EMC$^2$). We follow the global contrastive learning loss as introduced in SogCLR, and propose EMC$^2$ which utilizes an adaptive Metropolis-Hastings subroutine to generate hardness-aware negative samples in an online fashion during the optimization. We prove that EMC$^2$ finds an $\mathcal{O}(1/\sqrt{T})$-stationary point of the global contrastive loss in $T$ iterations. Compared to prior works, EMC$^2$ is the first algorithm that exhibits global convergence (to stationarity) regardless of the choice of batch size while exhibiting low computation and memory cost. Numerical experiments validate that EMC$^2$ is effective with small batch training and achieves comparable or better performance than baseline algorithms. We report the results for pre-training image encoders on STL-10 and Imagenet-100.
APA
Yau, C., Wai, H.T., Raman, P., Sarkar, S. & Hong, M. (2024). EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:56966-56981. Available from https://proceedings.mlr.press/v235/yau24a.html.
