On the Importance of Embedding Norms in Self-Supervised Learning

Andrew Draganov, Sharvaree Vadgama, Sebastian Damrich, Jan Niklas Böhm, Lucas Maes, Dmitry Kobak, Erik J Bekkers
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:14417-14438, 2025.

Abstract

Self-supervised learning (SSL) allows training data representations without a supervised signal and has become an important paradigm in machine learning. Most SSL methods employ the cosine similarity between embedding vectors and hence effectively embed data on a hypersphere. While this seemingly implies that embedding norms cannot play any role in SSL, a few recent works have suggested that embedding norms have properties related to network convergence and confidence. In this paper, we resolve this apparent contradiction and systematically establish the embedding norm’s role in SSL training. Using theoretical analysis, simulations, and experiments, we show that embedding norms (i) govern SSL convergence rates and (ii) encode network confidence, with smaller norms corresponding to unexpected samples. Additionally, we show that manipulating embedding norms can have large effects on convergence speed. Our findings demonstrate that SSL embedding norms are integral to understanding and optimizing network behavior.
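To make the abstract's central tension concrete: cosine similarity is invariant to rescaling the embeddings, yet the gradient of the cosine similarity with respect to an embedding scales inversely with that embedding's norm, which is how norms can still govern convergence speed even when the loss ignores them. The following NumPy sketch is illustrative only (it is not the paper's code); the gradient formula follows from differentiating the cosine similarity directly:

    import numpy as np

    rng = np.random.default_rng(0)
    z1, z2 = rng.normal(size=8), rng.normal(size=8)

    def cosine(u, v):
        """Cosine similarity between two vectors."""
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    # (i) Cosine similarity ignores norms: rescaling either vector changes nothing.
    assert np.isclose(cosine(z1, z2), cosine(5.0 * z1, 0.1 * z2))

    def grad_cosine_wrt_u(u, v):
        """Gradient of cosine(u, v) w.r.t. u: (v_hat - cos(u,v) * u_hat) / ||u||."""
        u_hat = u / np.linalg.norm(u)
        v_hat = v / np.linalg.norm(v)
        return (v_hat - (u_hat @ v_hat) * u_hat) / np.linalg.norm(u)

    # (ii) The gradient magnitude scales as 1/||u||: embeddings with larger
    # norms receive smaller directional updates per step.
    for scale in (0.5, 1.0, 2.0, 4.0):
        g = grad_cosine_wrt_u(scale * z1, z2)
        print(f"||z1|| scaled by {scale:>3}: ||grad|| = {np.linalg.norm(g):.4f}")

The printed gradient norms decay as 1/scale, matching the closed-form magnitude sin(theta)/||u||, where theta is the angle between the two embeddings.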

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-draganov25a,
  title     = {On the Importance of Embedding Norms in Self-Supervised Learning},
  author    = {Draganov, Andrew and Vadgama, Sharvaree and Damrich, Sebastian and B\"{o}hm, Jan Niklas and Maes, Lucas and Kobak, Dmitry and Bekkers, Erik J},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {14417--14438},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/draganov25a/draganov25a.pdf},
  url       = {https://proceedings.mlr.press/v267/draganov25a.html},
  abstract  = {Self-supervised learning (SSL) allows training data representations without a supervised signal and has become an important paradigm in machine learning. Most SSL methods employ the cosine similarity between embedding vectors and hence effectively embed data on a hypersphere. While this seemingly implies that embedding norms cannot play any role in SSL, a few recent works have suggested that embedding norms have properties related to network convergence and confidence. In this paper, we resolve this apparent contradiction and systematically establish the embedding norm's role in SSL training. Using theoretical analysis, simulations, and experiments, we show that embedding norms (i) govern SSL convergence rates and (ii) encode network confidence, with smaller norms corresponding to unexpected samples. Additionally, we show that manipulating embedding norms can have large effects on convergence speed. Our findings demonstrate that SSL embedding norms are integral to understanding and optimizing network behavior.}
}
Endnote
%0 Conference Paper
%T On the Importance of Embedding Norms in Self-Supervised Learning
%A Andrew Draganov
%A Sharvaree Vadgama
%A Sebastian Damrich
%A Jan Niklas Böhm
%A Lucas Maes
%A Dmitry Kobak
%A Erik J Bekkers
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-draganov25a
%I PMLR
%P 14417--14438
%U https://proceedings.mlr.press/v267/draganov25a.html
%V 267
%X Self-supervised learning (SSL) allows training data representations without a supervised signal and has become an important paradigm in machine learning. Most SSL methods employ the cosine similarity between embedding vectors and hence effectively embed data on a hypersphere. While this seemingly implies that embedding norms cannot play any role in SSL, a few recent works have suggested that embedding norms have properties related to network convergence and confidence. In this paper, we resolve this apparent contradiction and systematically establish the embedding norm's role in SSL training. Using theoretical analysis, simulations, and experiments, we show that embedding norms (i) govern SSL convergence rates and (ii) encode network confidence, with smaller norms corresponding to unexpected samples. Additionally, we show that manipulating embedding norms can have large effects on convergence speed. Our findings demonstrate that SSL embedding norms are integral to understanding and optimizing network behavior.
APA
Draganov, A., Vadgama, S., Damrich, S., Böhm, J.N., Maes, L., Kobak, D. & Bekkers, E.J. (2025). On the Importance of Embedding Norms in Self-Supervised Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:14417-14438. Available from https://proceedings.mlr.press/v267/draganov25a.html.
