Position: LLMs Need a Bayesian Meta-Reasoning Framework for More Robust and Generalizable Reasoning

Hanqi Yan, Linhai Zhang, Jiazheng Li, Zhenyi Shen, Yulan He
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:82360-82383, 2025.

Abstract

Large language models (LLMs) excel at many reasoning tasks but continue to face significant challenges, including a lack of robustness in reasoning, poor cross-task generalization, and inefficiency in scaling up reasoning capabilities. Current training paradigms, including next-token prediction and reinforcement learning from human feedback, often fall short in adaptability to diverse reasoning tasks. Existing approaches, such as prompt optimization and iterative output refinement, offer performance improvements but can be inefficient and generalize poorly. To overcome these limitations, this position paper argues for a transformative shift in how LLMs approach reasoning. Drawing inspiration from cognitive science, particularly meta-reasoning theories such as Dual-Process Theory and Metacognitive Reasoning, we propose a Bayesian meta-reasoning framework for LLMs. Our approach integrates self-awareness, monitoring, evaluation, regulation, and meta-reflection to enhance LLMs’ ability to refine reasoning strategies and generalize across tasks. We revisit existing LLM reasoning methods, identify key challenges, and suggest directions for future research.
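The paper develops this framework conceptually; as a rough, hypothetical sketch of the Bayesian view the abstract describes, the short Python program below keeps a belief over candidate reasoning strategies and reweights it by Bayes' rule after each observed task outcome. The strategy names, likelihood values, and one-bit success signal are all illustrative assumptions, not the authors' formulation.

# Minimal sketch (our illustration, not the authors' method): a Bayesian
# update over a fixed set of reasoning strategies. All names and numbers
# below are hypothetical.

STRATEGIES = ["chain_of_thought", "decompose_subgoals", "analogy"]

# Self-awareness: a prior belief over which strategy suits the task.
prior = {s: 1.0 / len(STRATEGIES) for s in STRATEGIES}

# Assumed probability that each strategy yields a successful trace;
# monitoring/evaluation would estimate these from intermediate signals.
p_success = {"chain_of_thought": 0.7, "decompose_subgoals": 0.5, "analogy": 0.2}

def update(belief, succeeded):
    """One regulation step: reweight strategies by the observed evidence."""
    post = {s: p * (p_success[s] if succeeded else 1.0 - p_success[s])
            for s, p in belief.items()}
    z = sum(post.values())  # normalizing constant of Bayes' rule
    return {s: v / z for s, v in post.items()}

# Meta-reflection across episodes: fold each outcome back into the belief.
belief = prior
for outcome in [True, True, False]:
    belief = update(belief, outcome)
print(max(belief, key=belief.get))  # strategy currently believed best

In the full framework, monitoring and evaluation would supply richer evidence than a one-bit outcome, and meta-reflection would revise the strategy library itself rather than only the weights over it.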

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-yan25g,
  title     = {Position: {LLM}s Need a {B}ayesian Meta-Reasoning Framework for More Robust and Generalizable Reasoning},
  author    = {Yan, Hanqi and Zhang, Linhai and Li, Jiazheng and Shen, Zhenyi and He, Yulan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {82360--82383},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/yan25g/yan25g.pdf},
  url       = {https://proceedings.mlr.press/v267/yan25g.html},
  abstract  = {Large language models (LLMs) excel at many reasoning tasks but continue to face significant challenges, including a lack of robustness in reasoning, poor cross-task generalization, and inefficiency in scaling up reasoning capabilities. Current training paradigms, including next-token prediction and reinforcement learning from human feedback, often fall short in adaptability to diverse reasoning tasks. Existing approaches, such as prompt optimization and iterative output refinement, offer performance improvements but can be inefficient and generalize poorly. To overcome these limitations, this position paper argues for a transformative shift in how LLMs approach reasoning. Drawing inspiration from cognitive science, particularly meta-reasoning theories such as Dual-Process Theory and Metacognitive Reasoning, we propose a Bayesian meta-reasoning framework for LLMs. Our approach integrates self-awareness, monitoring, evaluation, regulation, and meta-reflection to enhance LLMs’ ability to refine reasoning strategies and generalize across tasks. We revisit existing LLM reasoning methods, identify key challenges, and suggest directions for future research.}
}
Endnote
%0 Conference Paper
%T Position: LLMs Need a Bayesian Meta-Reasoning Framework for More Robust and Generalizable Reasoning
%A Hanqi Yan
%A Linhai Zhang
%A Jiazheng Li
%A Zhenyi Shen
%A Yulan He
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-yan25g
%I PMLR
%P 82360--82383
%U https://proceedings.mlr.press/v267/yan25g.html
%V 267
%X Large language models (LLMs) excel at many reasoning tasks but continue to face significant challenges, including a lack of robustness in reasoning, poor cross-task generalization, and inefficiency in scaling up reasoning capabilities. Current training paradigms, including next-token prediction and reinforcement learning from human feedback, often fall short in adaptability to diverse reasoning tasks. Existing approaches, such as prompt optimization and iterative output refinement, offer performance improvements but can be inefficient and generalize poorly. To overcome these limitations, this position paper argues for a transformative shift in how LLMs approach reasoning. Drawing inspiration from cognitive science, particularly meta-reasoning theories such as Dual-Process Theory and Metacognitive Reasoning, we propose a Bayesian meta-reasoning framework for LLMs. Our approach integrates self-awareness, monitoring, evaluation, regulation, and meta-reflection to enhance LLMs’ ability to refine reasoning strategies and generalize across tasks. We revisit existing LLM reasoning methods, identify key challenges, and suggest directions for future research.
APA
Yan, H., Zhang, L., Li, J., Shen, Z., & He, Y. (2025). Position: LLMs Need a Bayesian Meta-Reasoning Framework for More Robust and Generalizable Reasoning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:82360-82383. Available from https://proceedings.mlr.press/v267/yan25g.html.