Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression

Hooman Shahrokhi, Devjeet Raj Roy, Yan Yan, Venera Arnaoudova, Jana Doppa
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:3718-3748, 2025.

Abstract

We consider the problem of generating valid and small prediction sets by sampling outputs (e.g., software code and natural language text) from a black-box deep generative model for a given input (e.g., textual prompt). The validity of a prediction set is determined by a user-defined binary admissibility function that depends on the target application: for example, requiring at least one program in the set to pass all test cases in a code generation application. To address this problem, we develop a simple and effective conformal inference algorithm referred to as Generative Prediction Sets (GPS). Given a set of calibration examples and black-box access to a deep generative model, GPS can generate prediction sets with provable guarantees. The key insight behind GPS is to exploit the inherent structure within the distribution over the minimum number of samples needed to obtain an admissible output, reducing set construction to a simple conformal regression problem over this quantity. Unlike prior work, the sets generated by GPS do not require iterative sampling at test time, while maintaining strict marginal coverage guarantees. Experiments on multiple datasets for code and math word problems using different large language models demonstrate the efficacy of GPS over state-of-the-art methods.
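To make the key insight concrete, here is a minimal sketch (not the authors' implementation) of the underlying idea: compute, for each calibration prompt, the number of samples drawn until the admissibility function first succeeds, then take a finite-sample conformal quantile of those counts as the test-time sample budget. The function name and the example calibration counts are illustrative assumptions.

```python
import math

def conformal_sample_budget(min_samples_cal, alpha=0.1):
    """Return a sample budget k such that, under exchangeability of the
    calibration and test examples, drawing k samples at test time yields
    at least one admissible output with probability >= 1 - alpha."""
    n = len(min_samples_cal)
    # Conformal quantile rank with the finite-sample (n + 1) correction.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        raise ValueError("alpha is too small for this calibration set size")
    return sorted(min_samples_cal)[rank - 1]

# Hypothetical calibration data: samples-until-admissible for 10 prompts.
budget = conformal_sample_budget([1, 2, 1, 3, 5, 2, 1, 4, 2, 6], alpha=0.2)
```

Because the budget is a fixed number computed once from calibration data, no iterative sampling loop is needed at test time, which mirrors the property the abstract highlights.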

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-shahrokhi25a,
  title     = {Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression},
  author    = {Shahrokhi, Hooman and Roy, Devjeet Raj and Yan, Yan and Arnaoudova, Venera and Doppa, Jana},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {3718--3748},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/shahrokhi25a/shahrokhi25a.pdf},
  url       = {https://proceedings.mlr.press/v286/shahrokhi25a.html},
  abstract  = {We consider the problem of generating valid and small prediction sets by sampling outputs (e.g., software code and natural language text) from a black-box deep generative model for a given input (e.g., textual prompt). The validity of a prediction set is determined by a user-defined binary admissibility function that depends on the target application: for example, requiring at least one program in the set to pass all test cases in a code generation application. To address this problem, we develop a simple and effective conformal inference algorithm referred to as {\em Generative Prediction Sets (GPS)}. Given a set of calibration examples and black-box access to a deep generative model, GPS can generate prediction sets with provable guarantees. The key insight behind GPS is to exploit the inherent structure within the distribution over the minimum number of samples needed to obtain an admissible output, reducing set construction to a simple conformal regression problem over this quantity. Unlike prior work, the sets generated by GPS do not require iterative sampling at test time, while maintaining strict marginal coverage guarantees. Experiments on multiple datasets for code and math word problems using different large language models demonstrate the efficacy of GPS over state-of-the-art methods.}
}
Endnote
%0 Conference Paper
%T Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression
%A Hooman Shahrokhi
%A Devjeet Raj Roy
%A Yan Yan
%A Venera Arnaoudova
%A Jana Doppa
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-shahrokhi25a
%I PMLR
%P 3718--3748
%U https://proceedings.mlr.press/v286/shahrokhi25a.html
%V 286
%X We consider the problem of generating valid and small prediction sets by sampling outputs (e.g., software code and natural language text) from a black-box deep generative model for a given input (e.g., textual prompt). The validity of a prediction set is determined by a user-defined binary admissibility function that depends on the target application: for example, requiring at least one program in the set to pass all test cases in a code generation application. To address this problem, we develop a simple and effective conformal inference algorithm referred to as Generative Prediction Sets (GPS). Given a set of calibration examples and black-box access to a deep generative model, GPS can generate prediction sets with provable guarantees. The key insight behind GPS is to exploit the inherent structure within the distribution over the minimum number of samples needed to obtain an admissible output, reducing set construction to a simple conformal regression problem over this quantity. Unlike prior work, the sets generated by GPS do not require iterative sampling at test time, while maintaining strict marginal coverage guarantees. Experiments on multiple datasets for code and math word problems using different large language models demonstrate the efficacy of GPS over state-of-the-art methods.
APA
Shahrokhi, H., Roy, D.R., Yan, Y., Arnaoudova, V. & Doppa, J. (2025). Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:3718-3748. Available from https://proceedings.mlr.press/v286/shahrokhi25a.html.

Related Material