On the Stepwise Nature of Self-Supervised Learning

James B Simon, Maksis Knutins, Liu Ziyin, Daniel Geisz, Abraham J Fetterman, Joshua Albrecht
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:31852-31876, 2023.

Abstract

We present a simple picture of the training process of self-supervised learning methods with dual deep networks. In our picture, these methods learn their high-dimensional embeddings one dimension at a time in a sequence of discrete, well-separated steps. We arrive at this picture via the study of a linear toy model of Barlow Twins, applicable to the case in which the trained network is infinitely wide. We solve the training dynamics of our toy model from small initialization, finding that the model learns the top eigenmodes of a certain contrastive kernel in a discrete, stepwise fashion, and find a closed-form expression for the final learned representations. Remarkably, we see the same stepwise learning phenomenon when training deep ResNets using the Barlow Twins, SimCLR, and VICReg losses. This stepwise picture partially demystifies the process of self-supervised training.
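As a rough illustration of the stepwise phenomenon the abstract describes, the following is a minimal sketch in PyTorch, not the authors' code: a linear model trained from small initialization with an unnormalized Barlow Twins-style loss (lambda = 1, no per-dimension standardization). The data dimensions, augmentation scheme, and hyperparameters are illustrative assumptions. Tracking the singular values of the weight matrix shows the embedding dimensions being learned one at a time.

import torch

torch.manual_seed(0)
n, d, k = 2048, 10, 3                       # samples, input dim, embedding dim (assumed values)
scales = 0.5 ** torch.arange(d, dtype=torch.float32)
X = torch.randn(n, d) * scales              # anisotropic data: well-separated eigenmodes

W = 1e-4 * torch.randn(k, d)                # small initialization, as in the toy model
W.requires_grad_(True)
opt = torch.optim.SGD([W], lr=0.05)

for step in range(1201):
    x1 = X + 0.05 * torch.randn_like(X)     # two augmented views (toy additive-noise augmentation)
    x2 = X + 0.05 * torch.randn_like(X)
    z1, z2 = x1 @ W.T, x2 @ W.T
    C = z1.T @ z2 / n                       # k x k cross-correlation of the two embeddings
    loss = ((C - torch.eye(k)) ** 2).sum()  # Barlow Twins-style loss, unnormalized variant
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:
        svals = torch.linalg.svdvals(W.detach())
        print(f"step {step:5d}  loss {loss.item():8.4f}  "
              + "  ".join(f"{s:.3f}" for s in svals))

In runs of this sketch, the printed singular values depart from zero and plateau one at a time, in order of the data's eigenvalues. The small initialization is what separates the steps: each mode grows exponentially at a rate set by its eigenvalue, so starting near zero stretches the modes' arrival times apart, mirroring the discrete, well-separated steps described in the abstract.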

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-simon23a,
  title     = {On the Stepwise Nature of Self-Supervised Learning},
  author    = {Simon, James B and Knutins, Maksis and Ziyin, Liu and Geisz, Daniel and Fetterman, Abraham J and Albrecht, Joshua},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {31852--31876},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/simon23a/simon23a.pdf},
  url       = {https://proceedings.mlr.press/v202/simon23a.html},
  abstract  = {We present a simple picture of the training process of self-supervised learning methods with dual deep networks. In our picture, these methods learn their high-dimensional embeddings one dimension at a time in a sequence of discrete, well-separated steps. We arrive at this picture via the study of a linear toy model of Barlow Twins, applicable to the case in which the trained network is infinitely wide. We solve the training dynamics of our toy model from small initialization, finding that the model learns the top eigenmodes of a certain contrastive kernel in a discrete, stepwise fashion, and find a closed-form expression for the final learned representations. Remarkably, we see the same stepwise learning phenomenon when training deep ResNets using the Barlow Twins, SimCLR, and VICReg losses. This stepwise picture partially demystifies the process of self-supervised training.}
}
EndNote
%0 Conference Paper
%T On the Stepwise Nature of Self-Supervised Learning
%A James B Simon
%A Maksis Knutins
%A Liu Ziyin
%A Daniel Geisz
%A Abraham J Fetterman
%A Joshua Albrecht
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-simon23a
%I PMLR
%P 31852--31876
%U https://proceedings.mlr.press/v202/simon23a.html
%V 202
%X We present a simple picture of the training process of self-supervised learning methods with dual deep networks. In our picture, these methods learn their high-dimensional embeddings one dimension at a time in a sequence of discrete, well-separated steps. We arrive at this picture via the study of a linear toy model of Barlow Twins, applicable to the case in which the trained network is infinitely wide. We solve the training dynamics of our toy model from small initialization, finding that the model learns the top eigenmodes of a certain contrastive kernel in a discrete, stepwise fashion, and find a closed-form expression for the final learned representations. Remarkably, we see the same stepwise learning phenomenon when training deep ResNets using the Barlow Twins, SimCLR, and VICReg losses. This stepwise picture partially demystifies the process of self-supervised training.
APA
Simon, J.B., Knutins, M., Ziyin, L., Geisz, D., Fetterman, A.J. & Albrecht, J. (2023). On the Stepwise Nature of Self-Supervised Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:31852-31876. Available from https://proceedings.mlr.press/v202/simon23a.html.