CAP: Conformalized Abstention Policies for Context-Adaptive Risk Management for LLMs and VLMs

Sina Tayebati, Divake Kumar, Nastaran Darabi, Dinithi Jayasuriya, Theja Tulabandhula, Ranganath Krishnan, Amit Ranjan Trivedi
Proceedings of the 17th Asian Conference on Machine Learning, PMLR 304:926-941, 2025.

Abstract

Large Language and Vision-Language Models (LLMs/VLMs) are increasingly deployed in high-stakes domains where predictive failures can be costly. Conformal Prediction (CP) offers distribution-free uncertainty quantification with finite-sample coverage guarantees, but its reliance on a globally fixed risk level enforces a uniform trade-off between coverage and informativeness, misaligned with the instance-specific uncertainty patterns of modern foundation models. We propose Conformalized Abstention Policies (CAP), a novel framework that integrates CP with deep Reinforcement Learning (RL) to learn per-instance abstention policies. CAP trains a utility-driven policy to dynamically select the conformal risk level for each input, balancing point prediction, set prediction, and full abstention based on downstream utility. We further introduce Policy-Calibrated Coverage, a theoretical guarantee ensuring that the empirical coverage of the learned policy reliably estimates its true expected performance. Extensive experiments show that CAP maintains the 90% target coverage while substantially outperforming static CP baselines: improving hallucination detection AUROC by up to 22.2%, uncertainty-guided selective generation AUARC by 21.2%, and reducing calibration error by over 70%. CAP also extends to free-form generation, managing the trade-off between detail and factuality on a per-instance basis by learning an optimal risk level for sub-claim retention.
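For readers unfamiliar with the static split-CP baseline that CAP contrasts with, the following minimal sketch shows how a prediction set is built from calibration scores at a risk level alpha, and how varying alpha per input (as a CAP-style policy would) changes the set size. The function names and toy scores are illustrative, not from the paper.

```python
import numpy as np

def conformal_quantile(cal_scores, alpha):
    """Split-CP threshold: the ceil((n+1)(1-alpha))/n empirical quantile
    of the calibration nonconformity scores."""
    n = len(cal_scores)
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(cal_scores, q_level, method="higher")

def prediction_set(probs, qhat):
    """Include every label whose nonconformity score (1 - prob) is <= qhat."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]

# Toy calibration scores (nonconformity = 1 - prob assigned to the true label).
rng = np.random.default_rng(0)
cal_scores = rng.uniform(0.0, 0.5, size=500)

# A static CP baseline fixes alpha globally for all inputs; a CAP-style policy
# would instead map each input to its own alpha, here shown with two values.
probs = np.array([0.60, 0.35, 0.05])
for alpha in (0.1, 0.3):
    qhat = conformal_quantile(cal_scores, alpha)
    print(f"alpha={alpha}: qhat={qhat:.3f}, set={prediction_set(probs, qhat)}")
```

A larger alpha yields a smaller threshold and hence a smaller (possibly empty) set, which is the lever a per-instance policy tunes to trade coverage against informativeness.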

Cite this Paper


BibTeX
@InProceedings{pmlr-v304-tayebati25a,
  title = {CAP: Conformalized Abstention Policies for Context-Adaptive Risk Management for LLMs and VLMs},
  author = {Tayebati, Sina and Kumar, Divake and Darabi, Nastaran and Jayasuriya, Dinithi and Tulabandhula, Theja and Krishnan, Ranganath and Trivedi, Amit Ranjan},
  booktitle = {Proceedings of the 17th Asian Conference on Machine Learning},
  pages = {926--941},
  year = {2025},
  editor = {Lee, Hung-yi and Liu, Tongliang},
  volume = {304},
  series = {Proceedings of Machine Learning Research},
  month = {09--12 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v304/main/assets/tayebati25a/tayebati25a.pdf},
  url = {https://proceedings.mlr.press/v304/tayebati25a.html},
  abstract = {Large Language and Vision-Language Models (LLMs/VLMs) are increasingly deployed in high-stakes domains where predictive failures can be costly. Conformal Prediction (CP) offers distribution-free uncertainty quantification with finite-sample coverage guarantees, but its reliance on a globally fixed risk level enforces a uniform trade-off between coverage and informativeness, misaligned with the instance-specific uncertainty patterns of modern foundation models. We propose the framework of Conformalized Abstention Policy (CAP), a novel framework that integrates CP with deep Reinforcement Learning (RL) to learn per-instance abstention policies. CAP trains a utility-driven policy to dynamically select the conformal risk level for each input, balancing point prediction, set prediction, and full abstention based on downstream utility. We specifically introduce Policy-Calibrated Coverage, a theoretical guarantee ensuring that the empirical coverage of the learned policy reliably estimates its true expected performance. Extensive experiments show that CAP maintains the 90% target coverage while substantially outperforming static CP baselines: improving hallucination detection AUROC by up to 22.2%, uncertainty-guided selective generation AUARC by 21.2%, and reducing calibration error by over 70%. CAP also extends to free-form generation by managing the trade-off between a detailed and factual response on a per-instance basis by learning an optimal risk level for sub-claim retention.}
}
Endnote
%0 Conference Paper
%T CAP: Conformalized Abstention Policies for Context-Adaptive Risk Management for LLMs and VLMs
%A Sina Tayebati
%A Divake Kumar
%A Nastaran Darabi
%A Dinithi Jayasuriya
%A Theja Tulabandhula
%A Ranganath Krishnan
%A Amit Ranjan Trivedi
%B Proceedings of the 17th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Hung-yi Lee
%E Tongliang Liu
%F pmlr-v304-tayebati25a
%I PMLR
%P 926--941
%U https://proceedings.mlr.press/v304/tayebati25a.html
%V 304
%X Large Language and Vision-Language Models (LLMs/VLMs) are increasingly deployed in high-stakes domains where predictive failures can be costly. Conformal Prediction (CP) offers distribution-free uncertainty quantification with finite-sample coverage guarantees, but its reliance on a globally fixed risk level enforces a uniform trade-off between coverage and informativeness, misaligned with the instance-specific uncertainty patterns of modern foundation models. We propose the framework of Conformalized Abstention Policy (CAP), a novel framework that integrates CP with deep Reinforcement Learning (RL) to learn per-instance abstention policies. CAP trains a utility-driven policy to dynamically select the conformal risk level for each input, balancing point prediction, set prediction, and full abstention based on downstream utility. We specifically introduce Policy-Calibrated Coverage, a theoretical guarantee ensuring that the empirical coverage of the learned policy reliably estimates its true expected performance. Extensive experiments show that CAP maintains the 90% target coverage while substantially outperforming static CP baselines: improving hallucination detection AUROC by up to 22.2%, uncertainty-guided selective generation AUARC by 21.2%, and reducing calibration error by over 70%. CAP also extends to free-form generation by managing the trade-off between a detailed and factual response on a per-instance basis by learning an optimal risk level for sub-claim retention.
APA
Tayebati, S., Kumar, D., Darabi, N., Jayasuriya, D., Tulabandhula, T., Krishnan, R. & Trivedi, A.R. (2025). CAP: Conformalized Abstention Policies for Context-Adaptive Risk Management for LLMs and VLMs. Proceedings of the 17th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 304:926-941. Available from https://proceedings.mlr.press/v304/tayebati25a.html.
