LLM-Augmented Chemical Synthesis and Design Decision Programs

Haorui Wang, Jeff Guo, Lingkai Kong, Rampi Ramprasad, Philippe Schwaller, Yuanqi Du, Chao Zhang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:62980-63001, 2025.

Abstract

Retrosynthesis, the process of breaking down a target molecule into simpler precursors through a series of valid reactions, stands at the core of organic chemistry and drug development. Although recent machine learning (ML) research has advanced single-step retrosynthetic modeling and subsequent route searches, these solutions remain restricted by the extensive combinatorial space of possible pathways. Concurrently, large language models (LLMs) have exhibited remarkable chemical knowledge, hinting at their potential to tackle complex decision-making tasks in chemistry. In this work, we explore whether LLMs can successfully navigate the highly constrained, multi-step retrosynthesis planning problem. We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy, moving beyond the conventional step-by-step reactant prediction. Through comprehensive evaluations, we show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wang25ag,
  title = {{LLM}-Augmented Chemical Synthesis and Design Decision Programs},
  author = {Wang, Haorui and Guo, Jeff and Kong, Lingkai and Ramprasad, Rampi and Schwaller, Philippe and Du, Yuanqi and Zhang, Chao},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {62980--63001},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wang25ag/wang25ag.pdf},
  url = {https://proceedings.mlr.press/v267/wang25ag.html},
  abstract = {Retrosynthesis, the process of breaking down a target molecule into simpler precursors through a series of valid reactions, stands at the core of organic chemistry and drug development. Although recent machine learning (ML) research has advanced single-step retrosynthetic modeling and subsequent route searches, these solutions remain restricted by the extensive combinatorial space of possible pathways. Concurrently, large language models (LLMs) have exhibited remarkable chemical knowledge, hinting at their potential to tackle complex decision-making tasks in chemistry. In this work, we explore whether LLMs can successfully navigate the highly constrained, multi-step retrosynthesis planning problem. We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy, moving beyond the conventional step-by-step reactant prediction. Through comprehensive evaluations, we show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.}
}
Endnote
%0 Conference Paper
%T LLM-Augmented Chemical Synthesis and Design Decision Programs
%A Haorui Wang
%A Jeff Guo
%A Lingkai Kong
%A Rampi Ramprasad
%A Philippe Schwaller
%A Yuanqi Du
%A Chao Zhang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wang25ag
%I PMLR
%P 62980--63001
%U https://proceedings.mlr.press/v267/wang25ag.html
%V 267
%X Retrosynthesis, the process of breaking down a target molecule into simpler precursors through a series of valid reactions, stands at the core of organic chemistry and drug development. Although recent machine learning (ML) research has advanced single-step retrosynthetic modeling and subsequent route searches, these solutions remain restricted by the extensive combinatorial space of possible pathways. Concurrently, large language models (LLMs) have exhibited remarkable chemical knowledge, hinting at their potential to tackle complex decision-making tasks in chemistry. In this work, we explore whether LLMs can successfully navigate the highly constrained, multi-step retrosynthesis planning problem. We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy, moving beyond the conventional step-by-step reactant prediction. Through comprehensive evaluations, we show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.
APA
Wang, H., Guo, J., Kong, L., Ramprasad, R., Schwaller, P., Du, Y., & Zhang, C. (2025). LLM-Augmented Chemical Synthesis and Design Decision Programs. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:62980-63001. Available from https://proceedings.mlr.press/v267/wang25ag.html.
