The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:29143-29160, 2023.

Abstract

The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients.
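For concreteness, the ER decomposition referenced in the abstract can be written as the standard variational (Barber-Agakov style) lower bound on the MI between the embeddings of two views, here denoted Z_1 and Z_2; the notation below is illustrative rather than copied from the paper:

```latex
I(Z_1; Z_2) \;=\; H(Z_2) - H(Z_2 \mid Z_1)
\;\geq\; \underbrace{H(Z_2)}_{\text{entropy}}
\;+\; \underbrace{\mathbb{E}\big[\log q_{Z_2 \mid Z_1}(Z_2 \mid Z_1)\big]}_{\text{reconstruction}}
```

The inequality holds because replacing the true conditional with any variational reconstruction distribution q only subtracts a non-negative KL term. As a rough sketch of how such an objective could be optimized in the discrete (cluster-assignment) setting, the snippet below estimates the entropy term from the batch-averaged assignment distribution and the reconstruction term as a cross-view log-likelihood. The function name `er_objective` and all implementation details are hypothetical assumptions for illustration, not the authors' released code:

```python
import torch

def er_objective(p1: torch.Tensor, p2: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical ER-style objective for soft cluster assignments.

    p1, p2: (N, K) rows of probabilities over K clusters for two views
    of the same N samples. Returns a scalar to be *maximized*.
    """
    # Entropy term: entropy of the marginal assignment distribution,
    # estimated from the batch average of the second view's assignments.
    p_marginal = p2.mean(dim=0)  # shape (K,)
    entropy = -(p_marginal * (p_marginal + eps).log()).sum()
    # Reconstruction term: expected log-likelihood of view 2's assignment
    # under a categorical q parameterized by view 1 (a cross-entropy).
    reconstruction = (p2 * (p1 + eps).log()).sum(dim=1).mean()
    return entropy + reconstruction
```

Intuitively, maximizing the entropy term resists collapse of all samples onto a single cluster, while the reconstruction term pulls the two views' assignments together; in practice one would minimize the negative of this scalar with a standard optimizer.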

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-rodri-guez-galvez23a,
  title     = {The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning},
  author    = {Rodr\'{\i}guez G\'{a}lvez, Borja and Blaas, Arno and Rodriguez, Pau and Golinski, Adam and Suau, Xavier and Ramapuram, Jason and Busbridge, Dan and Zappella, Luca},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {29143--29160},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/rodri-guez-galvez23a/rodri-guez-galvez23a.pdf},
  url       = {https://proceedings.mlr.press/v202/rodri-guez-galvez23a.html},
  abstract  = {The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients.}
}
Endnote
%0 Conference Paper
%T The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
%A Borja Rodríguez Gálvez
%A Arno Blaas
%A Pau Rodriguez
%A Adam Golinski
%A Xavier Suau
%A Jason Ramapuram
%A Dan Busbridge
%A Luca Zappella
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-rodri-guez-galvez23a
%I PMLR
%P 29143--29160
%U https://proceedings.mlr.press/v202/rodri-guez-galvez23a.html
%V 202
%X The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients.
APA
Rodríguez Gálvez, B., Blaas, A., Rodriguez, P., Golinski, A., Suau, X., Ramapuram, J., Busbridge, D. & Zappella, L. (2023). The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:29143-29160. Available from https://proceedings.mlr.press/v202/rodri-guez-galvez23a.html.