Geometric Latent Diffusion Models for 3D Molecule Generation

Minkai Xu, Alexander S Powers, Ron O. Dror, Stefano Ermon, Jure Leskovec
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:38592-38610, 2023.

Abstract

Generative models, especially diffusion models (DMs), have achieved promising results for generating feature-rich geometries and advancing foundational science problems such as molecule design. Inspired by the recent huge success of Stable (latent) Diffusion models, we propose a novel and principled method for 3D molecule generation named Geometric Latent Diffusion Models (GeoLDM). GeoLDM is the first latent DM for the molecular geometry domain, composed of autoencoders encoding structures into continuous latent codes and DMs operating in the latent space. Our key innovation is that for modeling 3D molecular geometries, we capture their critical roto-translational equivariance constraints by building a point-structured latent space with both invariant scalars and equivariant tensors. Extensive experiments demonstrate that GeoLDM can consistently achieve better performance on multiple molecule generation benchmarks, with up to 7% improvement for the valid percentage of large biomolecules. Results also demonstrate GeoLDM’s higher capacity for controllable generation thanks to the latent modeling. Code is provided at https://github.com/MinkaiXu/GeoLDM.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-xu23n,
  title = {Geometric Latent Diffusion Models for 3{D} Molecule Generation},
  author = {Xu, Minkai and Powers, Alexander S and Dror, Ron O. and Ermon, Stefano and Leskovec, Jure},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {38592--38610},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/xu23n/xu23n.pdf},
  url = {https://proceedings.mlr.press/v202/xu23n.html},
  abstract = {Generative models, especially diffusion models (DMs), have achieved promising results for generating feature-rich geometries and advancing foundational science problems such as molecule design. Inspired by the recent huge success of Stable (latent) Diffusion models, we propose a novel and principled method for 3D molecule generation named Geometric Latent Diffusion Models (GeoLDM). GeoLDM is the first latent DM model for the molecular geometry domain, composed of autoencoders encoding structures into continuous latent codes and DMs operating in the latent space. Our key innovation is that for modeling the 3D molecular geometries, we capture its critical roto-translational equivariance constraints by building a point-structured latent space with both invariant scalars and equivariant tensors. Extensive experiments demonstrate that GeoLDM can consistently achieve better performance on multiple molecule generation benchmarks, with up to 7% improvement for the valid percentage of large biomolecules. Results also demonstrate GeoLDM’s higher capacity for controllable generation thanks to the latent modeling. Code is provided at https://github.com/MinkaiXu/GeoLDM.}
}
Endnote
%0 Conference Paper
%T Geometric Latent Diffusion Models for 3D Molecule Generation
%A Minkai Xu
%A Alexander S Powers
%A Ron O. Dror
%A Stefano Ermon
%A Jure Leskovec
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-xu23n
%I PMLR
%P 38592--38610
%U https://proceedings.mlr.press/v202/xu23n.html
%V 202
%X Generative models, especially diffusion models (DMs), have achieved promising results for generating feature-rich geometries and advancing foundational science problems such as molecule design. Inspired by the recent huge success of Stable (latent) Diffusion models, we propose a novel and principled method for 3D molecule generation named Geometric Latent Diffusion Models (GeoLDM). GeoLDM is the first latent DM model for the molecular geometry domain, composed of autoencoders encoding structures into continuous latent codes and DMs operating in the latent space. Our key innovation is that for modeling the 3D molecular geometries, we capture its critical roto-translational equivariance constraints by building a point-structured latent space with both invariant scalars and equivariant tensors. Extensive experiments demonstrate that GeoLDM can consistently achieve better performance on multiple molecule generation benchmarks, with up to 7% improvement for the valid percentage of large biomolecules. Results also demonstrate GeoLDM’s higher capacity for controllable generation thanks to the latent modeling. Code is provided at https://github.com/MinkaiXu/GeoLDM.
APA
Xu, M., Powers, A.S., Dror, R.O., Ermon, S. & Leskovec, J. (2023). Geometric Latent Diffusion Models for 3D Molecule Generation. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:38592-38610. Available from https://proceedings.mlr.press/v202/xu23n.html.
