Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation

Joonho Lee, Jae Oh Woo, Juree Seok, Parisa Hassanzadeh, Wooseok Jang, Juyoun Son, Sima Didari, Baruch Gutow, Heng Hao, Hankyu Moon, Wenjun Hu, Yeong-Dae Kwon, Taehee Lee, Seungjai Min
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:27009-27036, 2024.

Abstract

Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models.
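
The abstract does not spell out the implementation, but the core mechanism it describes, a reward model trained on preference pairs that also reports how uncertain each score is via a Bayesian approximation, can be sketched compactly. The Python/PyTorch snippet below is a minimal illustration only, assuming Monte Carlo dropout as the Bayesian approximation and a Bradley-Terry pairwise loss on preference data; the names UncertaintyAwareRewardModel, preference_loss, and reward_with_uncertainty, and the use of pooled embeddings from a frozen encoder, are illustrative assumptions rather than the authors' actual architecture.

import torch
import torch.nn as nn

class UncertaintyAwareRewardModel(nn.Module):
    """Reward head with dropout kept active at inference time, so repeated
    stochastic forward passes approximate a posterior over reward scores
    (Monte Carlo dropout)."""

    def __init__(self, hidden_dim: int = 768, dropout_p: float = 0.1):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout_p),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        # response_embedding: (batch, hidden_dim) pooled representation of an
        # (instruction, response) pair from a frozen language-model encoder.
        return self.head(response_embedding).squeeze(-1)

    @torch.no_grad()
    def reward_with_uncertainty(self, response_embedding, n_samples: int = 20):
        """Return the mean reward and its standard deviation across
        stochastic forward passes (dropout left enabled)."""
        self.train()  # keep dropout active for MC sampling
        samples = torch.stack([self(response_embedding) for _ in range(n_samples)])
        return samples.mean(dim=0), samples.std(dim=0)

def preference_loss(model, chosen_emb, rejected_emb):
    """Bradley-Terry style pairwise loss on preference data:
    the chosen response should receive the higher reward."""
    margin = model(chosen_emb) - model(rejected_emb)
    return -torch.nn.functional.logsigmoid(margin).mean()

# Toy usage with random embeddings standing in for encoder outputs.
if __name__ == "__main__":
    model = UncertaintyAwareRewardModel(hidden_dim=768)
    chosen, rejected = torch.randn(4, 768), torch.randn(4, 768)
    loss = preference_loss(model, chosen, rejected)
    loss.backward()
    mean_r, std_r = model.reward_with_uncertainty(chosen)
    print("loss:", loss.item(), "reward:", mean_r.tolist(), "uncertainty:", std_r.tolist())

In a pipeline like the one the abstract describes, the returned standard deviation could, under these assumptions, be used to filter ambiguous preference pairs during data curation or to down-weight uncertain rewards in the policy-optimization objective.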

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-lee24z,
  title     = {Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation},
  author    = {Lee, Joonho and Woo, Jae Oh and Seok, Juree and Hassanzadeh, Parisa and Jang, Wooseok and Son, Juyoun and Didari, Sima and Gutow, Baruch and Hao, Heng and Moon, Hankyu and Hu, Wenjun and Kwon, Yeong-Dae and Lee, Taehee and Min, Seungjai},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {27009--27036},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lee24z/lee24z.pdf},
  url       = {https://proceedings.mlr.press/v235/lee24z.html},
  abstract  = {Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models.}
}
Endnote
%0 Conference Paper
%T Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
%A Joonho Lee
%A Jae Oh Woo
%A Juree Seok
%A Parisa Hassanzadeh
%A Wooseok Jang
%A Juyoun Son
%A Sima Didari
%A Baruch Gutow
%A Heng Hao
%A Hankyu Moon
%A Wenjun Hu
%A Yeong-Dae Kwon
%A Taehee Lee
%A Seungjai Min
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-lee24z
%I PMLR
%P 27009--27036
%U https://proceedings.mlr.press/v235/lee24z.html
%V 235
%X Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models.
APA
Lee, J., Woo, J.O., Seok, J., Hassanzadeh, P., Jang, W., Son, J., Didari, S., Gutow, B., Hao, H., Moon, H., Hu, W., Kwon, Y., Lee, T. & Min, S. (2024). Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:27009-27036. Available from https://proceedings.mlr.press/v235/lee24z.html.
