Generalization Analysis for Contrastive Representation Learning

Yunwen Lei, Tianbao Yang, Yiming Ying, Ding-Xuan Zhou
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:19200-19227, 2023.

Abstract

Recently, contrastive learning has achieved impressive success in advancing the state of the art on various machine learning tasks. However, the existing generalization analysis is very limited or even not meaningful. In particular, the existing generalization error bounds depend linearly on the number $k$ of negative examples, while it has been widely shown in practice that choosing a large $k$ is necessary to guarantee good generalization of contrastive learning on downstream tasks. In this paper, we establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms. Our analysis uses structural results on empirical covering numbers and Rademacher complexities to exploit the Lipschitz continuity of loss functions. For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates under a low-noise condition. We apply our results to learning with both linear representations and nonlinear representations by deep neural networks, and in both cases derive Rademacher complexity bounds that yield improved generalization bounds.
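
As a rough illustration of the setting described above (a sketch of the standard contrastive framework with $k$ negative examples; the notation here is illustrative and the paper's exact definitions may differ), the unsupervised risk of a representation $f$ is typically written as

$$ L_{\mathrm{un}}(f) = \mathbb{E}_{x,\,x^{+},\,x_{1}^{-},\ldots,x_{k}^{-}}\Big[\ell\big(\{f(x)^{\top}(f(x^{+})-f(x_{i}^{-}))\}_{i=1}^{k}\big)\Big], \qquad \text{e.g. } \ell(v)=\log\Big(1+\sum_{i=1}^{k}e^{-v_{i}}\Big), $$

where $(x, x^{+})$ is a positive pair and $x_{1}^{-},\ldots,x_{k}^{-}$ are negative examples. Earlier bounds for this type of risk grow linearly in $k$; the bounds claimed in the abstract instead depend on $k$ only through logarithmic factors.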

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-lei23a,
  title     = {Generalization Analysis for Contrastive Representation Learning},
  author    = {Lei, Yunwen and Yang, Tianbao and Ying, Yiming and Zhou, Ding-Xuan},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {19200--19227},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/lei23a/lei23a.pdf},
  url       = {https://proceedings.mlr.press/v202/lei23a.html}
}
Endnote
%0 Conference Paper
%T Generalization Analysis for Contrastive Representation Learning
%A Yunwen Lei
%A Tianbao Yang
%A Yiming Ying
%A Ding-Xuan Zhou
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-lei23a
%I PMLR
%P 19200--19227
%U https://proceedings.mlr.press/v202/lei23a.html
%V 202
APA
Lei, Y., Yang, T., Ying, Y. & Zhou, D. (2023). Generalization Analysis for Contrastive Representation Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:19200-19227. Available from https://proceedings.mlr.press/v202/lei23a.html.