Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Qinglin Zhu, Runcong Zhao, Hanqi Yan, Yulan He, Yudong Chen, Lin Gui
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:80427-80447, 2025.

Abstract

Large Language Models (LLMs) struggle with complex reasoning due to limited diversity and inefficient search. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide generation. It combines (1) embedding perturbation for controlled exploration and (2) Bayesian optimisation to refine embeddings via a verifier-guided objective, balancing exploration and exploitation. This approach improves reasoning accuracy and coherence while avoiding reliance on heuristic search. Experiments demonstrate superior correctness with minimal computation, making it a scalable, model-agnostic solution.
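The abstract's two ingredients — perturbing the first-token embedding for exploration and refining it against a verifier score — can be illustrated with a toy search loop. This is a minimal sketch, not the paper's method: the verifier is a stand-in objective on a synthetic "hidden good embedding", and the Bayesian optimisation step is replaced by a simple perturb-and-select rule with an annealed exploration radius.

```python
import numpy as np

rng = np.random.default_rng(0)

def verifier_score(embedding, target):
    # Toy stand-in for a verifier-guided objective: higher (less negative)
    # when the candidate embedding is closer to the hidden target.
    return -float(np.linalg.norm(embedding - target))

def soft_reasoning_sketch(dim=16, n_rounds=20, n_candidates=8, sigma=0.5):
    target = rng.normal(size=dim)          # hidden "good" first-token embedding
    best = rng.normal(size=dim)            # initial embedding guess
    best_score = verifier_score(best, target)
    init_score = best_score
    for _ in range(n_rounds):
        # (1) controlled exploration: Gaussian perturbations of the incumbent
        candidates = best + sigma * rng.normal(size=(n_candidates, dim))
        scores = [verifier_score(c, target) for c in candidates]
        # (2) exploitation: keep whichever candidate the verifier prefers
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best, best_score = candidates[i], scores[i]
        sigma *= 0.9                       # shrink the exploration radius
    return init_score, best_score
```

Running the loop should raise the verifier score well above the initial random guess; the paper's actual framework replaces the perturb-and-select rule with Bayesian optimisation over the embedding space.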

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-zhu25ae,
  title     = {Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration},
  author    = {Zhu, Qinglin and Zhao, Runcong and Yan, Hanqi and He, Yulan and Chen, Yudong and Gui, Lin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {80427--80447},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhu25ae/zhu25ae.pdf},
  url       = {https://proceedings.mlr.press/v267/zhu25ae.html},
  abstract  = {Large Language Models (LLMs) struggle with complex reasoning due to limited diversity and inefficient search. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide generation. It combines (1) embedding perturbation for controlled exploration and (2) Bayesian optimisation to refine embeddings via a verifier-guided objective, balancing exploration and exploitation. This approach improves reasoning accuracy and coherence while avoiding reliance on heuristic search. Experiments demonstrate superior correctness with minimal computation, making it a scalable, model-agnostic solution.}
}
Endnote
%0 Conference Paper
%T Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
%A Qinglin Zhu
%A Runcong Zhao
%A Hanqi Yan
%A Yulan He
%A Yudong Chen
%A Lin Gui
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zhu25ae
%I PMLR
%P 80427--80447
%U https://proceedings.mlr.press/v267/zhu25ae.html
%V 267
%X Large Language Models (LLMs) struggle with complex reasoning due to limited diversity and inefficient search. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide generation. It combines (1) embedding perturbation for controlled exploration and (2) Bayesian optimisation to refine embeddings via a verifier-guided objective, balancing exploration and exploitation. This approach improves reasoning accuracy and coherence while avoiding reliance on heuristic search. Experiments demonstrate superior correctness with minimal computation, making it a scalable, model-agnostic solution.
APA
Zhu, Q., Zhao, R., Yan, H., He, Y., Chen, Y. & Gui, L. (2025). Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:80427-80447. Available from https://proceedings.mlr.press/v267/zhu25ae.html.