Investigating the Role of Negatives in Contrastive Representation Learning

Jordan Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:7187-7209, 2022.

Abstract

Noise contrastive learning is a popular technique for unsupervised representation learning. In this approach, a representation is obtained via reduction to supervised learning, where given a notion of semantic similarity, the learner tries to distinguish a similar (positive) example from a collection of random (negative) examples. The success of modern contrastive learning pipelines relies on many design decisions, such as the choice of data augmentation, the number of negative examples, and the batch size; however, there is limited understanding as to how these parameters interact and affect downstream performance. We focus on disambiguating the role of one of these parameters: the number of negative examples. Theoretically, we show the existence of a collision-coverage trade-off suggesting that the optimal number of negative examples should scale with the number of underlying concepts in the data. Empirically, we scrutinize the role of the number of negatives in both NLP and vision tasks.
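
As a concrete illustration of the setup the abstract describes, below is a minimal sketch (written here in PyTorch; this is not the authors' code) of a noise-contrastive objective in which the number of negatives k is an explicit parameter. The temperature, embedding dimension, and random toy inputs are illustrative assumptions, not values from the paper.

# Minimal sketch of a contrastive (InfoNCE-style) loss with k negatives.
# Not the authors' implementation; hyperparameters below are placeholders.
import torch
import torch.nn.functional as F

def nce_loss(anchor, positive, negatives, temperature=0.1):
    """anchor, positive: (B, d); negatives: (B, k, d)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    # Cosine similarity of each anchor to its positive: (B, 1)
    pos_sim = (anchor * positive).sum(-1, keepdim=True)
    # Cosine similarity of each anchor to its k negatives: (B, k)
    neg_sim = torch.einsum('bd,bkd->bk', anchor, negatives)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    # The positive sits at index 0; the learner classifies it against the k negatives.
    labels = torch.zeros(anchor.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    B, k, d = 32, 8, 128  # batch size, number of negatives, embedding dimension
    loss = nce_loss(torch.randn(B, d), torch.randn(B, d), torch.randn(B, k, d))
    print(loss.item())

In this sketch, k is the design parameter the paper studies: too few negatives under-cover the underlying concepts, while too many increase the chance that a negative collides with the anchor's own concept.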

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-ash22a,
  title     = {Investigating the Role of Negatives in Contrastive Representation Learning},
  author    = {Ash, Jordan and Goel, Surbhi and Krishnamurthy, Akshay and Misra, Dipendra},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {7187--7209},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/ash22a/ash22a.pdf},
  url       = {https://proceedings.mlr.press/v151/ash22a.html},
  abstract  = {Noise contrastive learning is a popular technique for unsupervised representation learning. In this approach, a representation is obtained via reduction to supervised learning, where given a notion of semantic similarity, the learner tries to distinguish a similar (positive) example from a collection of random (negative) examples. The success of modern contrastive learning pipelines relies on many design decisions, such as the choice of data augmentation, the number of negative examples, and the batch size; however, there is limited understanding as to how these parameters interact and affect downstream performance. We focus on disambiguating the role of one of these parameters: the number of negative examples. Theoretically, we show the existence of a collision-coverage trade-off suggesting that the optimal number of negative examples should scale with the number of underlying concepts in the data. Empirically, we scrutinize the role of the number of negatives in both NLP and vision tasks.}
}
Endnote
%0 Conference Paper
%T Investigating the Role of Negatives in Contrastive Representation Learning
%A Jordan Ash
%A Surbhi Goel
%A Akshay Krishnamurthy
%A Dipendra Misra
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-ash22a
%I PMLR
%P 7187--7209
%U https://proceedings.mlr.press/v151/ash22a.html
%V 151
%X Noise contrastive learning is a popular technique for unsupervised representation learning. In this approach, a representation is obtained via reduction to supervised learning, where given a notion of semantic similarity, the learner tries to distinguish a similar (positive) example from a collection of random (negative) examples. The success of modern contrastive learning pipelines relies on many design decisions, such as the choice of data augmentation, the number of negative examples, and the batch size; however, there is limited understanding as to how these parameters interact and affect downstream performance. We focus on disambiguating the role of one of these parameters: the number of negative examples. Theoretically, we show the existence of a collision-coverage trade-off suggesting that the optimal number of negative examples should scale with the number of underlying concepts in the data. Empirically, we scrutinize the role of the number of negatives in both NLP and vision tasks.
APA
Ash, J., Goel, S., Krishnamurthy, A. & Misra, D. (2022). Investigating the Role of Negatives in Contrastive Representation Learning. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:7187-7209. Available from https://proceedings.mlr.press/v151/ash22a.html.
