Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN

Talal Widatalla, Richard W. Shuai, Brian Hie, Possu Huang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:66746-66771, 2025.

Abstract

Leading deep learning-based methods for fixed-backbone protein sequence design do not model protein sidechain conformation during sequence generation, despite the large role that the three-dimensional arrangement of sidechain atoms plays in protein conformation, stability, and overall protein function. Instead, these models implicitly reason about crucial sidechain interactions based solely on backbone geometry and amino-acid sequence. To address this, we present FAMPNN (Full-Atom MPNN), a sequence design method that explicitly models both sequence identity and sidechain conformation for each residue, where the per-token distribution of a residue’s discrete amino acid identity and its continuous sidechain conformation are learned with a combined categorical cross-entropy and diffusion loss objective. We demonstrate that learning these distributions jointly is a highly synergistic task that improves sequence recovery while achieving state-of-the-art sidechain packing. Furthermore, benefits from explicit full-atom modeling generalize from sequence recovery to practical protein design applications, such as zero-shot prediction of experimental binding and stability measurements.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-widatalla25a,
  title = {Sidechain conditioning and modeling for full-atom protein sequence design with {FAMPNN}},
  author = {Widatalla, Talal and Shuai, Richard W. and Hie, Brian and Huang, Possu},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {66746--66771},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/widatalla25a/widatalla25a.pdf},
  url = {https://proceedings.mlr.press/v267/widatalla25a.html},
  abstract = {Leading deep learning-based methods for fixed-backbone protein sequence design do not model protein sidechain conformation during sequence generation despite the large role the three-dimensional arrangement of sidechain atoms play in protein conformation, stability, and overall protein function. Instead, these models implicitly reason about crucial sidechain interactions based solely on backbone geometry and amino-acid sequence. To address this, we present FAMPNN (Full-Atom MPNN), a sequence design method that explicitly models both sequence identity and sidechain conformation for each residue, where the per-token distribution of a residue’s discrete amino acid identity and its continuous sidechain conformation are learned with a combined categorical cross-entropy and diffusion loss objective. We demonstrate learning these distributions jointly is a highly synergistic task that both improves sequence recovery while achieving state-of-the-art sidechain packing. Furthermore, benefits from explicit full-atom modeling generalize from sequence recovery to practical protein design applications, such as zero-shot prediction of experimental binding and stability measurements.}
}
Endnote
%0 Conference Paper
%T Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN
%A Talal Widatalla
%A Richard W. Shuai
%A Brian Hie
%A Possu Huang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-widatalla25a
%I PMLR
%P 66746--66771
%U https://proceedings.mlr.press/v267/widatalla25a.html
%V 267
%X Leading deep learning-based methods for fixed-backbone protein sequence design do not model protein sidechain conformation during sequence generation despite the large role the three-dimensional arrangement of sidechain atoms play in protein conformation, stability, and overall protein function. Instead, these models implicitly reason about crucial sidechain interactions based solely on backbone geometry and amino-acid sequence. To address this, we present FAMPNN (Full-Atom MPNN), a sequence design method that explicitly models both sequence identity and sidechain conformation for each residue, where the per-token distribution of a residue’s discrete amino acid identity and its continuous sidechain conformation are learned with a combined categorical cross-entropy and diffusion loss objective. We demonstrate learning these distributions jointly is a highly synergistic task that both improves sequence recovery while achieving state-of-the-art sidechain packing. Furthermore, benefits from explicit full-atom modeling generalize from sequence recovery to practical protein design applications, such as zero-shot prediction of experimental binding and stability measurements.
APA
Widatalla, T., Shuai, R. W., Hie, B., & Huang, P. (2025). Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:66746-66771. Available from https://proceedings.mlr.press/v267/widatalla25a.html.
