R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents

Daniel D. Johnson, Daniel Tarlow, Christian Walder
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:15262-15306, 2023.

Abstract

Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output. When used to assist software developers, these models may make mistakes that users must go back and fix, or worse, introduce subtle bugs that users may miss entirely. We propose Randomized Utility-driven Synthesis of Uncertain REgions (R-U-SURE), an approach for building uncertainty-aware suggestions based on a decision-theoretic model of goal-conditioned utility, using random samples from a generative model as a proxy for the unobserved possible intents of the end user. Our technique combines minimum-Bayes-risk decoding, dual decomposition, and decision diagrams in order to efficiently produce structured uncertainty summaries, given only sample access to an arbitrary generative model of code and an optional AST parser. We demonstrate R-U-SURE on three developer-assistance tasks, and show that it can be applied to different user interaction patterns without retraining the model and leads to more accurate uncertainty estimates than token-probability baselines. We also release our implementation as an open-source library at https://github.com/google-research/r_u_sure.
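
As a rough illustration of the minimum-Bayes-risk idea mentioned in the abstract, the sketch below treats each model sample as a stand-in for one possible user intent and picks the candidate with the highest average utility across all samples. This is not the R-U-SURE algorithm itself: it omits uncertain regions, dual decomposition, and decision diagrams. The names `utility` and `mbr_select`, the example samples, and the character-similarity utility are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def utility(suggestion: str, intent: str) -> float:
    """Hypothetical utility: character-level similarity in [0, 1].

    A stand-in for the paper's goal-conditioned utility, which is
    not reproduced here.
    """
    return SequenceMatcher(None, suggestion, intent).ratio()


def mbr_select(samples: list[str]) -> str:
    """Return the sample with the highest mean utility over all samples.

    Each sample plays two roles: a candidate suggestion, and a proxy
    for one possible (unobserved) user intent.
    """
    def expected_utility(candidate: str) -> float:
        return sum(utility(candidate, s) for s in samples) / len(samples)

    return max(samples, key=expected_utility)


# Illustrative samples, as if drawn from a generative model of code.
samples = [
    "return x + 1",
    "return x + 1",
    "return x - 1",
    "raise ValueError(x)",
]
best = mbr_select(samples)
print(best)
```

With these samples, the candidate that agrees with most of the other samples is selected, which is the sense in which samples act as a proxy for user intent.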

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-johnson23a,
  title = {R-U-{SURE}? {U}ncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents},
  author = {Johnson, Daniel D. and Tarlow, Daniel and Walder, Christian},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {15262--15306},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/johnson23a/johnson23a.pdf},
  url = {https://proceedings.mlr.press/v202/johnson23a.html},
  abstract = {Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output. When used to assist software developers, these models may make mistakes that users must go back and fix, or worse, introduce subtle bugs that users may miss entirely. We propose Randomized Utility-driven Synthesis of Uncertain REgions (R-U-SURE), an approach for building uncertainty-aware suggestions based on a decision-theoretic model of goal-conditioned utility, using random samples from a generative model as a proxy for the unobserved possible intents of the end user. Our technique combines minimum-Bayes-risk decoding, dual decomposition, and decision diagrams in order to efficiently produce structured uncertainty summaries, given only sample access to an arbitrary generative model of code and an optional AST parser. We demonstrate R-U-SURE on three developer-assistance tasks, and show that it can be applied to different user interaction patterns without retraining the model and leads to more accurate uncertainty estimates than token-probability baselines. We also release our implementation as an open-source library at https://github.com/google-research/r_u_sure.}
}
Endnote
%0 Conference Paper
%T R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
%A Daniel D. Johnson
%A Daniel Tarlow
%A Christian Walder
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-johnson23a
%I PMLR
%P 15262--15306
%U https://proceedings.mlr.press/v202/johnson23a.html
%V 202
%X Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output. When used to assist software developers, these models may make mistakes that users must go back and fix, or worse, introduce subtle bugs that users may miss entirely. We propose Randomized Utility-driven Synthesis of Uncertain REgions (R-U-SURE), an approach for building uncertainty-aware suggestions based on a decision-theoretic model of goal-conditioned utility, using random samples from a generative model as a proxy for the unobserved possible intents of the end user. Our technique combines minimum-Bayes-risk decoding, dual decomposition, and decision diagrams in order to efficiently produce structured uncertainty summaries, given only sample access to an arbitrary generative model of code and an optional AST parser. We demonstrate R-U-SURE on three developer-assistance tasks, and show that it can be applied to different user interaction patterns without retraining the model and leads to more accurate uncertainty estimates than token-probability baselines. We also release our implementation as an open-source library at https://github.com/google-research/r_u_sure.
APA
Johnson, D. D., Tarlow, D., & Walder, C. (2023). R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:15262-15306. Available from https://proceedings.mlr.press/v202/johnson23a.html.

Related Material