Improving Pre-trained Self-Supervised Embeddings Through Effective Entropy Maximization

Deep Chakraborty, Yann LeCun, Tim G. J. Rudner, Erik Learned-Miller
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:433-441, 2025.

Abstract

A number of different architectures and loss functions have been applied to the problem of self-supervised learning (SSL), with the goal of developing embeddings that provide the best possible pre-training for as-yet-unknown, lightly supervised downstream tasks. One of these SSL criteria is to maximize the entropy of a set of embeddings in some compact space. But the goal of maximizing the embedding entropy often depends—whether explicitly or implicitly—upon high dimensional entropy estimates, which typically perform poorly in more than a few dimensions. In this paper, we motivate an effective entropy maximization criterion (E2MC), defined in terms of easy-to-estimate, low-dimensional constraints. We demonstrate that using it to continue training an already-trained SSL model for only a handful of epochs leads to a consistent and, in some cases, significant improvement in downstream performance. We perform careful ablation studies to show that the improved performance is due to the proposed add-on criterion. We also show that continued pre-training with alternative criteria does not lead to notable improvements, and in some cases, even degrades performance.
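
The exact form of the E2MC objective is given in the paper and is not reproduced here. As a rough, illustrative sketch of the general idea of replacing a hard-to-estimate joint entropy with easy-to-estimate low-dimensional quantities, the PyTorch snippet below sums per-dimension (marginal) entropy estimates of an embedding batch using a 1-D Gaussian kernel density estimate. The function name, the bandwidth, and the choice of marginal entropies as the low-dimensional constraints are all assumptions made for illustration, not the authors' implementation.

import math
import torch

def marginal_entropy_bonus(z: torch.Tensor, bandwidth: float = 0.1) -> torch.Tensor:
    """Sum of 1-D entropy estimates over embedding dimensions (to be maximized).

    z: (N, D) batch of embeddings. Each dimension's differential entropy is
    estimated as -mean(log p_hat(x)), where p_hat is a Gaussian KDE fit to that
    dimension of the batch. The self-term is kept for simplicity, which biases
    the estimate slightly downward but keeps the sketch short and differentiable.
    """
    n, d = z.shape
    total = z.new_zeros(())
    for j in range(d):
        x = z[:, j]
        diffs = x.unsqueeze(0) - x.unsqueeze(1)           # (N, N) pairwise differences
        log_kernel = -0.5 * (diffs / bandwidth) ** 2      # unnormalized Gaussian log-kernel
        log_p = (torch.logsumexp(log_kernel, dim=1)
                 - math.log(n) - math.log(bandwidth) - 0.5 * math.log(2.0 * math.pi))
        total = total - log_p.mean()                      # Monte-Carlo estimate of 1-D entropy
    return total

# Hypothetical use during continued pre-training: subtract the bonus from the SSL loss,
# e.g. loss = ssl_loss - lam * marginal_entropy_bonus(embeddings), with lam a small weight.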

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-chakraborty25a,
  title     = {Improving Pre-trained Self-Supervised Embeddings Through Effective Entropy Maximization},
  author    = {Chakraborty, Deep and LeCun, Yann and Rudner, Tim G. J. and Learned-Miller, Erik},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {433--441},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/chakraborty25a/chakraborty25a.pdf},
  url       = {https://proceedings.mlr.press/v258/chakraborty25a.html},
  abstract  = {A number of different architectures and loss functions have been applied to the problem of self-supervised learning (SSL), with the goal of developing embeddings that provide the best possible pre-training for as-yet-unknown, lightly supervised downstream tasks. One of these SSL criteria is to maximize the entropy of a set of embeddings in some compact space. But the goal of maximizing the embedding entropy often depends—whether explicitly or implicitly—upon high dimensional entropy estimates, which typically perform poorly in more than a few dimensions. In this paper, we motivate an effective entropy maximization criterion (E2MC), defined in terms of easy-to-estimate, low-dimensional constraints. We demonstrate that using it to continue training an already-trained SSL model for only a handful of epochs leads to a consistent and, in some cases, significant improvement in downstream performance. We perform careful ablation studies to show that the improved performance is due to the proposed add-on criterion. We also show that continued pre-training with alternative criteria does not lead to notable improvements, and in some cases, even degrades performance.}
}
Endnote
%0 Conference Paper
%T Improving Pre-trained Self-Supervised Embeddings Through Effective Entropy Maximization
%A Deep Chakraborty
%A Yann LeCun
%A Tim G. J. Rudner
%A Erik Learned-Miller
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-chakraborty25a
%I PMLR
%P 433--441
%U https://proceedings.mlr.press/v258/chakraborty25a.html
%V 258
%X A number of different architectures and loss functions have been applied to the problem of self-supervised learning (SSL), with the goal of developing embeddings that provide the best possible pre-training for as-yet-unknown, lightly supervised downstream tasks. One of these SSL criteria is to maximize the entropy of a set of embeddings in some compact space. But the goal of maximizing the embedding entropy often depends—whether explicitly or implicitly—upon high dimensional entropy estimates, which typically perform poorly in more than a few dimensions. In this paper, we motivate an effective entropy maximization criterion (E2MC), defined in terms of easy-to-estimate, low-dimensional constraints. We demonstrate that using it to continue training an already-trained SSL model for only a handful of epochs leads to a consistent and, in some cases, significant improvement in downstream performance. We perform careful ablation studies to show that the improved performance is due to the proposed add-on criterion. We also show that continued pre-training with alternative criteria does not lead to notable improvements, and in some cases, even degrades performance.
APA
Chakraborty, D., LeCun, Y., Rudner, T.G.J. & Learned-Miller, E. (2025). Improving Pre-trained Self-Supervised Embeddings Through Effective Entropy Maximization. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:433-441. Available from https://proceedings.mlr.press/v258/chakraborty25a.html.