Simultaneous Similarity-based Self-Distillation for Deep Metric Learning

Karsten Roth, Timo Milbich, Björn Ommer, Joseph Paul Cohen, Marzyeh Ghassemi
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9095-9106, 2021.

Abstract

Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot retrieval applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives. However, generalization capacity is known to scale with the embedding space dimensionality. Unfortunately, high-dimensional embeddings also incur higher retrieval costs for downstream applications. To remedy this, we propose S2SD: Simultaneous Similarity-based Self-Distillation. S2SD extends DML with knowledge distillation from auxiliary, high-dimensional embedding and feature spaces to leverage complementary context during training, while retaining test-time cost and adding negligible training-time overhead. Experiments and ablations across different objectives and standard benchmarks show that S2SD offers highly significant improvements of up to 7% in Recall@1, while also setting a new state of the art.
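To make the mechanism concrete, the following is a minimal PyTorch-style sketch of the similarity-based self-distillation the abstract describes: auxiliary high-dimensional embedding heads are trained alongside the target head on a shared backbone, and their batch similarity distributions supervise the lower-dimensional target space. All names (aux_heads, lam, temperature) and the single-direction KL objective are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn.functional as F

def similarity_distillation_loss(z_src, z_tgt, temperature=1.0):
    """KL divergence between row-wise softmaxed batch-similarity matrices.

    z_src: auxiliary high-dimensional ("teacher") embeddings, shape (B, D_src).
    z_tgt: target-dimensional ("student") embeddings, shape (B, D_tgt).
    Both spaces share the same batch, so both similarity matrices are (B, B).
    """
    s_src = z_src @ z_src.t() / temperature
    s_tgt = z_tgt @ z_tgt.t() / temperature
    p_src = F.softmax(s_src, dim=-1).detach()      # teacher distribution, no gradient
    log_p_tgt = F.log_softmax(s_tgt, dim=-1)       # student log-distribution
    return F.kl_div(log_p_tgt, p_src, reduction="batchmean")

def s2sd_step(phi, target_head, aux_heads, dml_loss, labels, lam=1.0):
    """One training step (hypothetical): DML loss on every space,
    plus distillation from each auxiliary space into the target space."""
    z_tgt = F.normalize(target_head(phi), dim=-1)  # test-time embedding
    loss = dml_loss(z_tgt, labels)
    for head in aux_heads:                         # training-only auxiliary spaces
        z_aux = F.normalize(head(phi), dim=-1)
        loss = loss + dml_loss(z_aux, labels)      # train the auxiliary space itself
        loss = loss + lam * similarity_distillation_loss(z_aux, z_tgt)
    return loss

Detaching the teacher distribution and matching row-wise softmax similarities (rather than the embeddings themselves) lets the target space inherit relational structure from the auxiliary spaces without having to match their dimensionality, which is why test-time retrieval cost is unchanged.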

Cite this Paper

BibTeX
@InProceedings{pmlr-v139-roth21a,
  title     = {Simultaneous Similarity-based Self-Distillation for Deep Metric Learning},
  author    = {Roth, Karsten and Milbich, Timo and Ommer, Bj{\"o}rn and Cohen, Joseph Paul and Ghassemi, Marzyeh},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {9095--9106},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/roth21a/roth21a.pdf},
  url       = {https://proceedings.mlr.press/v139/roth21a.html}
}
Endnote
%0 Conference Paper
%T Simultaneous Similarity-based Self-Distillation for Deep Metric Learning
%A Karsten Roth
%A Timo Milbich
%A Björn Ommer
%A Joseph Paul Cohen
%A Marzyeh Ghassemi
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-roth21a
%I PMLR
%P 9095--9106
%U https://proceedings.mlr.press/v139/roth21a.html
%V 139
APA
Roth, K., Milbich, T., Ommer, B., Cohen, J.P. & Ghassemi, M. (2021). Simultaneous Similarity-based Self-Distillation for Deep Metric Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:9095-9106. Available from https://proceedings.mlr.press/v139/roth21a.html.
