CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes

Peter Mikhael, Itamar Chinn, Regina Barzilay
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:35647-35663, 2024.

Abstract

Computational screening of naturally occurring proteins has the potential to identify efficient catalysts among the hundreds of millions of sequences that remain uncharacterized. Current experimental methods remain time, cost and labor intensive, limiting the number of enzymes they can reasonably screen. In this work, we propose a computational framework for in-silico enzyme screening. Through a contrastive objective, we train CLIPZyme to encode and align representations of enzyme structures and reaction pairs. With no standard computational baseline, we compare CLIPZyme to existing EC (enzyme commission) predictors applied to virtual enzyme screening and show improved performance in scenarios where limited information on the reaction is available (BEDROC$_{85}$ of 44.69%). Additionally, we evaluate combining EC predictors with CLIPZyme and show its generalization capacity on both unseen reactions and protein clusters.
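The abstract names a contrastive objective that aligns enzyme and reaction representations, but gives no formula. As a rough illustration only, the sketch below shows a generic CLIP-style symmetric contrastive loss and a similarity-based ranking step. It is not the authors' implementation: the function names, the plain-list embeddings, the cosine similarity, and the temperature of 0.07 are all assumptions made for this example.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def similarity_matrix(reactions, enzymes, temperature):
    """Cosine similarity of every reaction against every enzyme, divided by temperature."""
    r = [normalize(v) for v in reactions]
    e = [normalize(v) for v in enzymes]
    return [[sum(a * b for a, b in zip(ri, ej)) / temperature for ej in e] for ri in r]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def clip_style_loss(reactions, enzymes, temperature=0.07):
    """Symmetric contrastive loss; pair i of reactions and enzymes is the positive.

    The temperature value is an assumed default, not taken from the paper.
    """
    logits = similarity_matrix(reactions, enzymes, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))

def rank_enzymes(reaction, enzymes):
    """Return enzyme indices sorted from most to least similar to the reaction."""
    q = normalize(reaction)
    scores = [sum(a * b for a, b in zip(q, normalize(e))) for e in enzymes]
    return sorted(range(len(enzymes)), key=lambda i: -scores[i])

if __name__ == "__main__":
    # Toy embeddings standing in for encoder outputs (hypothetical data).
    emb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(clip_style_loss(emb, emb))            # matched pairs: low loss
    print(clip_style_loss(emb, emb[1:] + emb[:1]))  # mismatched pairs: high loss
    print(rank_enzymes([0.0, 0.9, 0.1], emb))
```

In a screening setting of the kind the abstract describes, the ranking step would be applied to a query reaction against a large set of precomputed enzyme embeddings; how CLIPZyme actually encodes either side is not stated on this page.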

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-mikhael24a,
  title = {{CLIPZ}yme: Reaction-Conditioned Virtual Screening of Enzymes},
  author = {Mikhael, Peter and Chinn, Itamar and Barzilay, Regina},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {35647--35663},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/mikhael24a/mikhael24a.pdf},
  url = {https://proceedings.mlr.press/v235/mikhael24a.html},
  abstract = {Computational screening of naturally occurring proteins has the potential to identify efficient catalysts among the hundreds of millions of sequences that remain uncharacterized. Current experimental methods remain time, cost and labor intensive, limiting the number of enzymes they can reasonably screen. In this work, we propose a computational framework for in-silico enzyme screening. Through a contrastive objective, we train CLIPZyme to encode and align representations of enzyme structures and reaction pairs. With no standard computational baseline, we compare CLIPZyme to existing EC (enzyme commission) predictors applied to virtual enzyme screening and show improved performance in scenarios where limited information on the reaction is available (BEDROC$_{85}$ of 44.69%). Additionally, we evaluate combining EC predictors with CLIPZyme and show its generalization capacity on both unseen reactions and protein clusters.}
}
Endnote
%0 Conference Paper
%T CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes
%A Peter Mikhael
%A Itamar Chinn
%A Regina Barzilay
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-mikhael24a
%I PMLR
%P 35647--35663
%U https://proceedings.mlr.press/v235/mikhael24a.html
%V 235
%X Computational screening of naturally occurring proteins has the potential to identify efficient catalysts among the hundreds of millions of sequences that remain uncharacterized. Current experimental methods remain time, cost and labor intensive, limiting the number of enzymes they can reasonably screen. In this work, we propose a computational framework for in-silico enzyme screening. Through a contrastive objective, we train CLIPZyme to encode and align representations of enzyme structures and reaction pairs. With no standard computational baseline, we compare CLIPZyme to existing EC (enzyme commission) predictors applied to virtual enzyme screening and show improved performance in scenarios where limited information on the reaction is available (BEDROC$_{85}$ of 44.69%). Additionally, we evaluate combining EC predictors with CLIPZyme and show its generalization capacity on both unseen reactions and protein clusters.
APA
Mikhael, P., Chinn, I. & Barzilay, R. (2024). CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:35647-35663. Available from https://proceedings.mlr.press/v235/mikhael24a.html.
