A generative recommender system with GMM prior for cancer drug generation and sensitivity prediction

Krzysztof Koras, Marcin Možejko, Paulina Szymczak, Adam Izdebski, Eike Staub, Ewa Szczurek
Proceedings of the 17th Machine Learning in Computational Biology meeting, PMLR 200:61-73, 2022.

Abstract

Recent emergence of high-throughput drug screening assays sparked an intensive development of machine learning methods, including models for prediction of sensitivity of cancer cell lines to anti-cancer drugs, as well as methods for generation of potential drug candidates. However, the concept of generating compounds with specific properties and simultaneous modeling of their efficacy against cancer cell lines has not been comprehensively explored. To address this need, we present VADEERS, a Variational Autoencoder-based Drug Efficacy Estimation Recommender System. The generation of compounds is performed by a novel variational autoencoder with a semi-supervised Gaussian mixture model (GMM) prior. The prior defines a clustering in the latent space, where the clusters are associated with specific drug properties. In addition, VADEERS is equipped with a cell line autoencoder and a sensitivity prediction network. The model combines data for SMILES string representations of anti-cancer drugs, their inhibition profiles against a panel of protein kinases, cell lines’ biological features and measurements of the sensitivity of the cell lines to the drugs. The evaluated variants of VADEERS achieve a high r=0.87 Pearson correlation between true and predicted drug sensitivity estimates. We show that the learned latent representations and new generated data points accurately reflect the given clustering. In summary, VADEERS offers a comprehensive model of drugs’ and cell lines’ properties and relationships between them, as well as a guided generation of novel compounds.

Cite this Paper


BibTeX
@InProceedings{pmlr-v200-koras22a,
  title = {A generative recommender system with GMM prior for cancer drug generation and sensitivity prediction},
  author = {Koras, Krzysztof and Mo\v{z}ejko, Marcin and Szymczak, Paulina and Izdebski, Adam and Staub, Eike and Szczurek, Ewa},
  booktitle = {Proceedings of the 17th Machine Learning in Computational Biology meeting},
  pages = {61--73},
  year = {2022},
  editor = {Knowles, David A and Mostafavi, Sara and Lee, Su-In},
  volume = {200},
  series = {Proceedings of Machine Learning Research},
  month = {21--22 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v200/koras22a/koras22a.pdf},
  url = {https://proceedings.mlr.press/v200/koras22a.html},
  abstract = {Recent emergence of high-throughput drug screening assays sparked an intensive development of machine learning methods, including models for prediction of sensitivity of cancer cell lines to anti-cancer drugs, as well as methods for generation of potential drug candidates. However, the concept of generating compounds with specific properties and simultaneous modeling of their efficacy against cancer cell lines has not been comprehensively explored. To address this need, we present VADEERS, a Variational Autoencoder-based Drug Efficacy Estimation Recommender System. The generation of compounds is performed by a novel variational autoencoder with a semi-supervised Gaussian mixture model (GMM) prior. The prior defines a clustering in the latent space, where the clusters are associated with specific drug properties. In addition, VADEERS is equipped with a cell line autoencoder and a sensitivity prediction network. The model combines data for SMILES string representations of anti-cancer drugs, their inhibition profiles against a panel of protein kinases, cell lines{’} biological features and measurements of the sensitivity of the cell lines to the drugs. The evaluated variants of VADEERS achieve a high r=0.87 Pearson correlation between true and predicted drug sensitivity estimates. We show that the learned latent representations and new generated data points accurately reflect the given clustering. In summary, VADEERS offers a comprehensive model of drugs{’} and cell lines{’} properties and relationships between them, as well as a guided generation of novel compounds.}
}
Endnote
%0 Conference Paper
%T A generative recommender system with GMM prior for cancer drug generation and sensitivity prediction
%A Krzysztof Koras
%A Marcin Možejko
%A Paulina Szymczak
%A Adam Izdebski
%A Eike Staub
%A Ewa Szczurek
%B Proceedings of the 17th Machine Learning in Computational Biology meeting
%C Proceedings of Machine Learning Research
%D 2022
%E David A Knowles
%E Sara Mostafavi
%E Su-In Lee
%F pmlr-v200-koras22a
%I PMLR
%P 61--73
%U https://proceedings.mlr.press/v200/koras22a.html
%V 200
%X Recent emergence of high-throughput drug screening assays sparked an intensive development of machine learning methods, including models for prediction of sensitivity of cancer cell lines to anti-cancer drugs, as well as methods for generation of potential drug candidates. However, the concept of generating compounds with specific properties and simultaneous modeling of their efficacy against cancer cell lines has not been comprehensively explored. To address this need, we present VADEERS, a Variational Autoencoder-based Drug Efficacy Estimation Recommender System. The generation of compounds is performed by a novel variational autoencoder with a semi-supervised Gaussian mixture model (GMM) prior. The prior defines a clustering in the latent space, where the clusters are associated with specific drug properties. In addition, VADEERS is equipped with a cell line autoencoder and a sensitivity prediction network. The model combines data for SMILES string representations of anti-cancer drugs, their inhibition profiles against a panel of protein kinases, cell lines’ biological features and measurements of the sensitivity of the cell lines to the drugs. The evaluated variants of VADEERS achieve a high r=0.87 Pearson correlation between true and predicted drug sensitivity estimates. We show that the learned latent representations and new generated data points accurately reflect the given clustering. In summary, VADEERS offers a comprehensive model of drugs’ and cell lines’ properties and relationships between them, as well as a guided generation of novel compounds.
APA
Koras, K., Možejko, M., Szymczak, P., Izdebski, A., Staub, E. & Szczurek, E. (2022). A generative recommender system with GMM prior for cancer drug generation and sensitivity prediction. Proceedings of the 17th Machine Learning in Computational Biology meeting, in Proceedings of Machine Learning Research 200:61-73. Available from https://proceedings.mlr.press/v200/koras22a.html.
