Confidence-Aware Contrastive Distillation for Test-time Prompt Tuning

Min Wang, Qing Cheng
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:660-666, 2025.

Abstract

Pre-trained vision-language models like CLIP have shown strong performance on various visual recognition tasks but often suffer from poor generalization under distribution shifts. Test-Time Prompt Tuning (TPT) is a promising solution that adapts prompt embeddings during inference using entropy minimization on unlabeled test data, while keeping the vision and text encoders frozen. However, entropy-based tuning lacks structural regularization and can lead to overconfident misclassifications. In this paper, we introduce Confidence-Aware Contrastive Distillation (CaCoD), a lightweight and effective approach to improve the robustness and calibration of TPT. Our method leverages the confidence structure of test-time predictions by identifying high- and low-confidence samples, and aligning their feature representations through a contrastive distillation loss. This encourages semantically meaningful updates to the prompt embeddings without requiring labels or retraining. Experiments across 11 fine-grained datasets demonstrate that CaCoD consistently reduces calibration error and improves predictive reliability, while maintaining strong accuracy. Our approach is model-agnostic and easily pluggable into existing TPT pipelines.
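The abstract describes the mechanism only at a high level. As a rough illustration, the PyTorch sketch below combines TPT-style entropy minimization over augmented views of a test image with a confidence-aware contrastive term that pulls low-confidence views toward high-confidence anchors sharing the same pseudo-label. All names and choices here (the entropy_loss and contrastive_distillation_loss helpers, the 0.7 confidence threshold, the temperature, the use of prompt-conditioned probability vectors as the "feature representations", and the toy per-class prompt offset) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F


def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mean prediction entropy, the standard TPT-style test-time objective."""
    probs = logits.softmax(dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()


def contrastive_distillation_loss(logits: torch.Tensor,
                                  conf_threshold: float = 0.7,
                                  tau: float = 0.07) -> torch.Tensor:
    """InfoNCE-style alignment of low-confidence views to detached
    high-confidence anchors that share the same pseudo-label (assumed form)."""
    probs = logits.softmax(dim=-1)
    conf, pseudo = probs.max(dim=-1)
    high, low = conf >= conf_threshold, conf < conf_threshold
    if high.sum() == 0 or low.sum() == 0:
        return logits.new_zeros(())
    anchors = F.normalize(probs[high], dim=-1).detach()  # teacher side (no grad)
    queries = F.normalize(probs[low], dim=-1)            # student side
    sim = queries @ anchors.t() / tau                    # [n_low, n_high]
    pos_mask = pseudo[low].unsqueeze(1) == pseudo[high].unsqueeze(0)
    if pos_mask.sum() == 0:
        return logits.new_zeros(())
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos_log_prob = (log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp_min(1)
    valid = pos_mask.any(dim=1)  # queries with at least one positive anchor
    return -pos_log_prob[valid].mean()


# Toy test-time tuning loop: frozen image features of augmented views, and a
# hypothetical tunable prompt modeled as a per-class offset on text features.
torch.manual_seed(0)
num_views, dim, num_classes = 32, 512, 10
image_features = F.normalize(torch.randn(num_views, dim), dim=-1)   # frozen
base_text_features = torch.randn(num_classes, dim)                  # frozen
prompt_offset = torch.nn.Parameter(torch.zeros(num_classes, dim))   # tunable
optimizer = torch.optim.AdamW([prompt_offset], lr=5e-3)

for step in range(10):
    text_features = F.normalize(base_text_features + prompt_offset, dim=-1)
    logits = 100.0 * image_features @ text_features.t()
    loss = entropy_loss(logits) + contrastive_distillation_loss(logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss = {loss.item():.4f}")

Note that detaching the high-confidence anchors makes them act as a fixed teacher, so only the low-confidence (student) side, and hence the prompt, receives gradients from the contrastive term; this is one plausible reading of the distillation described above.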

Cite this Paper


BibTeX
@InProceedings{pmlr-v278-wang25g,
  title     = {Confidence-Aware Contrastive Distillation for Test-time Prompt Tuning},
  author    = {Wang, Min and Cheng, Qing},
  booktitle = {Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing},
  pages     = {660--666},
  year      = {2025},
  editor    = {Zeng, Nianyin and Pachori, Ram Bilas and Wang, Dongshu},
  volume    = {278},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v278/main/assets/wang25g/wang25g.pdf},
  url       = {https://proceedings.mlr.press/v278/wang25g.html},
  abstract  = {Pre-trained vision-language models like CLIP have shown strong performance on various visual recognition tasks but often suffer from poor generalization under distribution shifts. Test-Time Prompt Tuning (TPT) is a promising solution that adapts prompt embeddings during inference using entropy minimization on unlabeled test data, while keeping the vision and text encoders frozen. However, entropy-based tuning lacks structural regularization and can lead to overconfident misclassifications. In this paper, we introduce Confidence-Aware Contrastive Distillation (CaCoD), a lightweight and effective approach to improve the robustness and calibration of TPT. Our method leverages the confidence structure of test-time predictions by identifying high- and low-confidence samples, and aligning their feature representations through a contrastive distillation loss. This encourages semantically meaningful updates to the prompt embeddings without requiring labels or retraining. Experiments across 11 fine-grained datasets demonstrate that CaCoD consistently reduces calibration error and improves predictive reliability, while maintaining strong accuracy. Our approach is model-agnostic and easily pluggable into existing TPT pipelines.}
}
Endnote
%0 Conference Paper
%T Confidence-Aware Contrastive Distillation for Test-time Prompt Tuning
%A Min Wang
%A Qing Cheng
%B Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2025
%E Nianyin Zeng
%E Ram Bilas Pachori
%E Dongshu Wang
%F pmlr-v278-wang25g
%I PMLR
%P 660--666
%U https://proceedings.mlr.press/v278/wang25g.html
%V 278
%X Pre-trained vision-language models like CLIP have shown strong performance on various visual recognition tasks but often suffer from poor generalization under distribution shifts. Test-Time Prompt Tuning (TPT) is a promising solution that adapts prompt embeddings during inference using entropy minimization on unlabeled test data, while keeping the vision and text encoders frozen. However, entropy-based tuning lacks structural regularization and can lead to overconfident misclassifications. In this paper, we introduce Confidence-Aware Contrastive Distillation (CaCoD), a lightweight and effective approach to improve the robustness and calibration of TPT. Our method leverages the confidence structure of test-time predictions by identifying high- and low-confidence samples, and aligning their feature representations through a contrastive distillation loss. This encourages semantically meaningful updates to the prompt embeddings without requiring labels or retraining. Experiments across 11 fine-grained datasets demonstrate that CaCoD consistently reduces calibration error and improves predictive reliability, while maintaining strong accuracy. Our approach is model-agnostic and easily pluggable into existing TPT pipelines.
APA
Wang, M. & Cheng, Q. (2025). Confidence-Aware Contrastive Distillation for Test-time Prompt Tuning. Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 278:660-666. Available from https://proceedings.mlr.press/v278/wang25g.html.
