Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling

Weijia Xu, Andrzej Banburski, Nebojsa Jojic
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:54852-54865, 2024.

Abstract

We introduce Reprompting, an iterative sampling algorithm that automatically learns the Chain-of-Thought (CoT) recipes for a given task without human intervention. Through Gibbs sampling, Reprompting infers the CoT recipes that work consistently well for a set of training samples by iteratively sampling new recipes using previously sampled recipes as parent prompts to solve other training problems. We conduct extensive experiments on 20 challenging reasoning tasks. Results show that Reprompting outperforms human-written CoT prompts substantially by +9.4 points on average. It also achieves consistently better performance than the state-of-the-art prompt optimization and decoding algorithms.
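The abstract describes a Gibbs-sampling loop: each training problem holds one CoT recipe slot, and each step resamples one slot conditioned on recipes currently assigned to other problems. The sketch below illustrates that loop structure only; `solve` (the LLM call that generates a candidate recipe from parent prompts) and `check` (answer verification) are hypothetical stand-ins, not the authors' implementation.

```python
import random

def reprompting(problems, solve, check, num_iters=100, k=3, seed=0):
    """Sketch of the Reprompting loop from the abstract: iteratively resample
    the CoT recipe for one problem, using previously sampled recipes for the
    other problems as the parent prompt. `solve` and `check` are placeholders
    for the LLM sampling and answer-checking steps."""
    rng = random.Random(seed)
    # One recipe slot per training problem; start empty (zero-shot).
    recipes = {p: "" for p in problems}
    for _ in range(num_iters):
        p = rng.choice(problems)  # pick one coordinate to resample (Gibbs step)
        # Parent prompt: a few recipes drawn from the other problems' slots.
        others = [r for q, r in recipes.items() if q != p and r]
        parents = rng.sample(others, min(k, len(others)))
        candidate = solve(parents, p)  # hypothetical LLM call
        if check(p, candidate):        # keep only recipes that solve the problem
            recipes[p] = candidate
    return recipes
```

Over many iterations, recipes that transfer well across problems survive and propagate through the parent prompts, which is how the method converges on recipes that "work consistently well" for the training set.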

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-xu24b,
  title     = {Reprompting: Automated Chain-of-Thought Prompt Inference Through {G}ibbs Sampling},
  author    = {Xu, Weijia and Banburski, Andrzej and Jojic, Nebojsa},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {54852--54865},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/xu24b/xu24b.pdf},
  url       = {https://proceedings.mlr.press/v235/xu24b.html},
  abstract  = {We introduce Reprompting, an iterative sampling algorithm that automatically learns the Chain-of-Thought (CoT) recipes for a given task without human intervention. Through Gibbs sampling, Reprompting infers the CoT recipes that work consistently well for a set of training samples by iteratively sampling new recipes using previously sampled recipes as parent prompts to solve other training problems. We conduct extensive experiments on 20 challenging reasoning tasks. Results show that Reprompting outperforms human-written CoT prompts substantially by +9.4 points on average. It also achieves consistently better performance than the state-of-the-art prompt optimization and decoding algorithms.}
}
Endnote
%0 Conference Paper
%T Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling
%A Weijia Xu
%A Andrzej Banburski
%A Nebojsa Jojic
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-xu24b
%I PMLR
%P 54852--54865
%U https://proceedings.mlr.press/v235/xu24b.html
%V 235
%X We introduce Reprompting, an iterative sampling algorithm that automatically learns the Chain-of-Thought (CoT) recipes for a given task without human intervention. Through Gibbs sampling, Reprompting infers the CoT recipes that work consistently well for a set of training samples by iteratively sampling new recipes using previously sampled recipes as parent prompts to solve other training problems. We conduct extensive experiments on 20 challenging reasoning tasks. Results show that Reprompting outperforms human-written CoT prompts substantially by +9.4 points on average. It also achieves consistently better performance than the state-of-the-art prompt optimization and decoding algorithms.
APA
Xu, W., Banburski, A. &amp; Jojic, N. (2024). Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:54852-54865. Available from https://proceedings.mlr.press/v235/xu24b.html.