Learning Condensed Graph via Differentiable Atom Mapping for Reaction Yield Prediction

Ankit Ghosh, Gargee Kashyap, Sarthak Mittal, Nupur Jain, Raghavan B Sunoj, Abir De
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:19327-19357, 2025.

Abstract

Yield of chemical reactions generally depends on the activation barrier, i.e., the energy difference between the reactant and the transition state. Computing the transition state from the reactant and product graphs requires prior knowledge of the correct node alignment (i.e., atom mapping), which is not available in yield prediction datasets. In this work, we propose YieldNet, a neural yield prediction model, which tackles these challenges. Here, we first approximate the atom mapping between the reactants and products using a differentiable node alignment network. We then use this approximate atom mapping to obtain a noisy realization of the condensed graph of reaction (CGR), which is a supergraph encompassing both the reactants and products. This CGR serves as a surrogate for the transition state graph structure. The CGR embeddings of different steps in a multi-step reaction are then passed into a transformer-guided reaction path encoder. Our experiments show that YieldNet can predict the yield more accurately than the baselines. Furthermore, the model is trained only under the distant supervision of yield values, without requiring fine-grained supervision of atom mapping.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-ghosh25a, title = {Learning Condensed Graph via Differentiable Atom Mapping for Reaction Yield Prediction}, author = {Ghosh, Ankit and Kashyap, Gargee and Mittal, Sarthak and Jain, Nupur and Sunoj, Raghavan B and De, Abir}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {19327--19357}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/ghosh25a/ghosh25a.pdf}, url = {https://proceedings.mlr.press/v267/ghosh25a.html}, abstract = {Yield of chemical reactions generally depends on the activation barrier, i.e., the energy difference between the reactant and the transition state. Computing the transition state from the reactant and product graphs requires prior knowledge of the correct node alignment (i.e., atom mapping), which is not available in yield prediction datasets. In this work, we propose YieldNet, a neural yield prediction model, which tackles these challenges. Here, we first approximate the atom mapping between the reactants and products using a differentiable node alignment network. We then use this approximate atom mapping to obtain a noisy realization of the condensed graph of reaction (CGR), which is a supergraph encompassing both the reactants and products. This CGR serves as a surrogate for the transition state graph structure. The CGR embeddings of different steps in a multi-step reaction are then passed into a transformer-guided reaction path encoder. Our experiments show that YieldNet can predict the yield more accurately than the baselines. Furthermore, the model is trained only under the distant supervision of yield values, without requiring fine-grained supervision of atom mapping.} }
Endnote
%0 Conference Paper %T Learning Condensed Graph via Differentiable Atom Mapping for Reaction Yield Prediction %A Ankit Ghosh %A Gargee Kashyap %A Sarthak Mittal %A Nupur Jain %A Raghavan B Sunoj %A Abir De %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-ghosh25a %I PMLR %P 19327--19357 %U https://proceedings.mlr.press/v267/ghosh25a.html %V 267 %X Yield of chemical reactions generally depends on the activation barrier, i.e., the energy difference between the reactant and the transition state. Computing the transition state from the reactant and product graphs requires prior knowledge of the correct node alignment (i.e., atom mapping), which is not available in yield prediction datasets. In this work, we propose YieldNet, a neural yield prediction model, which tackles these challenges. Here, we first approximate the atom mapping between the reactants and products using a differentiable node alignment network. We then use this approximate atom mapping to obtain a noisy realization of the condensed graph of reaction (CGR), which is a supergraph encompassing both the reactants and products. This CGR serves as a surrogate for the transition state graph structure. The CGR embeddings of different steps in a multi-step reaction are then passed into a transformer-guided reaction path encoder. Our experiments show that YieldNet can predict the yield more accurately than the baselines. Furthermore, the model is trained only under the distant supervision of yield values, without requiring fine-grained supervision of atom mapping.
APA
Ghosh, A., Kashyap, G., Mittal, S., Jain, N., Sunoj, R.B. & De, A.. (2025). Learning Condensed Graph via Differentiable Atom Mapping for Reaction Yield Prediction. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:19327-19357 Available from https://proceedings.mlr.press/v267/ghosh25a.html.

Related Material