On Overcoming Miscalibrated Conversational Priors in LLM-based ChatBots

Christine Herlihy; Jennifer Neville; Tobias Schnabel; Adith Swaminathan

Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:1599-1620, 2024.

Abstract

We explore the use of Large Language Model (LLM)-based chatbots to power recommender systems. We observe that the chatbots respond poorly when they encounter under-specified requests (e.g., they make incorrect assumptions, hedge with a long response, or refuse to answer). We conjecture that such miscalibrated response tendencies (i.e., conversational priors) can be attributed to LLM fine-tuning by annotators: single-turn annotations may not capture multi-turn conversation utility, and the annotators’ preferences may not even be representative of users interacting with a recommender system. We first analyze public LLM chat logs to conclude that query under-specification is common. Next, we study synthetic recommendation problems with known but latent item utilities, and frame them as Partially Observed Decision Processes (PODPs). We find that pre-trained LLMs can be sub-optimal for PODPs and derive better policies that clarify under-specified queries when appropriate. Then, we re-calibrate LLMs by prompting them with learned control messages to approximate the improved policy. Finally, we show empirically that our lightweight learning approach effectively uses logged conversation data to re-calibrate the response strategies of LLM-based chatbots for recommendation tasks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v244-herlihy24a,
  title = {On Overcoming Miscalibrated Conversational Priors in LLM-based ChatBots},
  author = {Herlihy, Christine and Neville, Jennifer and Schnabel, Tobias and Swaminathan, Adith},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages = {1599--1620},
  year = {2024},
  editor = {Kiyavash, Negar and Mooij, Joris M.},
  volume = {244},
  series = {Proceedings of Machine Learning Research},
  month = {15--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/herlihy24a/herlihy24a.pdf},
  url = {https://proceedings.mlr.press/v244/herlihy24a.html},
  abstract = {We explore the use of Large Language Model (LLM)-based chatbots to power recommender systems. We observe that the chatbots respond poorly when they encounter under-specified requests (e.g., they make incorrect assumptions, hedge with a long response, or refuse to answer). We conjecture that such miscalibrated response tendencies (i.e., conversational priors) can be attributed to LLM fine-tuning by annotators: single-turn annotations may not capture multi-turn conversation utility, and the annotators’ preferences may not even be representative of users interacting with a recommender system. We first analyze public LLM chat logs to conclude that query under-specification is common. Next, we study synthetic recommendation problems with known but latent item utilities, and frame them as Partially Observed Decision Processes (PODPs). We find that pre-trained LLMs can be sub-optimal for PODPs and derive better policies that clarify under-specified queries when appropriate. Then, we re-calibrate LLMs by prompting them with learned control messages to approximate the improved policy. Finally, we show empirically that our lightweight learning approach effectively uses logged conversation data to re-calibrate the response strategies of LLM-based chatbots for recommendation tasks.}
}

Endnote

%0 Conference Paper
%T On Overcoming Miscalibrated Conversational Priors in LLM-based ChatBots
%A Christine Herlihy
%A Jennifer Neville
%A Tobias Schnabel
%A Adith Swaminathan
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-herlihy24a
%I PMLR
%P 1599--1620
%U https://proceedings.mlr.press/v244/herlihy24a.html
%V 244
%X We explore the use of Large Language Model (LLM)-based chatbots to power recommender systems. We observe that the chatbots respond poorly when they encounter under-specified requests (e.g., they make incorrect assumptions, hedge with a long response, or refuse to answer). We conjecture that such miscalibrated response tendencies (i.e., conversational priors) can be attributed to LLM fine-tuning by annotators: single-turn annotations may not capture multi-turn conversation utility, and the annotators’ preferences may not even be representative of users interacting with a recommender system. We first analyze public LLM chat logs to conclude that query under-specification is common. Next, we study synthetic recommendation problems with known but latent item utilities, and frame them as Partially Observed Decision Processes (PODPs). We find that pre-trained LLMs can be sub-optimal for PODPs and derive better policies that clarify under-specified queries when appropriate. Then, we re-calibrate LLMs by prompting them with learned control messages to approximate the improved policy. Finally, we show empirically that our lightweight learning approach effectively uses logged conversation data to re-calibrate the response strategies of LLM-based chatbots for recommendation tasks.

APA


Herlihy, C., Neville, J., Schnabel, T., & Swaminathan, A. (2024). On Overcoming Miscalibrated Conversational Priors in LLM-based ChatBots. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:1599-1620. Available from https://proceedings.mlr.press/v244/herlihy24a.html.
