Structured Multi-task Learning for Molecular Property Prediction

Shengchao Liu, Meng Qu, Zuobai Zhang, Huiyu Cai, Jian Tang
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:8906-8920, 2022.

Abstract

Multi-task learning for molecular property prediction is becoming increasingly important in drug discovery. However, in contrast to other domains, the performance of multi-task learning in drug discovery is still unsatisfactory because the amount of labeled data for each task is too limited, which calls for additional data to compensate for the scarcity. In this paper, we study multi-task learning for molecular property prediction in a novel setting, where a relation graph between tasks is available. We first construct a dataset including around 400 tasks as well as a task relation graph. Then, to better utilize this relation graph, we propose a method called SGNN-EBM to systematically investigate structured task modeling from two perspectives. (1) In the latent space, we model the task representations by applying a state graph neural network (SGNN) on the relation graph. (2) In the output space, we employ structured prediction with an energy-based model (EBM), which can be efficiently trained through a noise-contrastive estimation (NCE) approach. Empirical results justify the effectiveness of SGNN-EBM. Code is available at https://github.com/chao1224/SGNN-EBM.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-liu22e,
  title = {Structured Multi-task Learning for Molecular Property Prediction},
  author = {Liu, Shengchao and Qu, Meng and Zhang, Zuobai and Cai, Huiyu and Tang, Jian},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages = {8906--8920},
  year = {2022},
  editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = {151},
  series = {Proceedings of Machine Learning Research},
  month = {28--30 Mar},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v151/liu22e/liu22e.pdf},
  url = {https://proceedings.mlr.press/v151/liu22e.html},
  abstract = {Multi-task learning for molecular property prediction is becoming increasingly important in drug discovery. However, in contrast to other domains, the performance of multi-task learning in drug discovery is still not satisfying as the number of labeled data for each task is too limited, which calls for additional data to complement the data scarcity. In this paper, we study multi-task learning for molecular property prediction in a novel setting, where a relation graph between tasks is available. We first construct a dataset including around 400 tasks as well as a task relation graph. Then to better utilize such relation graph, we propose a method called SGNN-EBM to systematically investigate the structured task modeling from two perspectives. (1) In the latent space, we model the task representations by applying a state graph neural network (SGNN) on the relation graph. (2) In the output space, we employ structured prediction with the energy-based model (EBM), which can be efficiently trained through noise-contrastive estimation (NCE) approach. Empirical results justify the effectiveness of SGNN-EBM. Code is available on https://github.com/chao1224/SGNN-EBM.}
}
Endnote
%0 Conference Paper
%T Structured Multi-task Learning for Molecular Property Prediction
%A Shengchao Liu
%A Meng Qu
%A Zuobai Zhang
%A Huiyu Cai
%A Jian Tang
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-liu22e
%I PMLR
%P 8906--8920
%U https://proceedings.mlr.press/v151/liu22e.html
%V 151
%X Multi-task learning for molecular property prediction is becoming increasingly important in drug discovery. However, in contrast to other domains, the performance of multi-task learning in drug discovery is still not satisfying as the number of labeled data for each task is too limited, which calls for additional data to complement the data scarcity. In this paper, we study multi-task learning for molecular property prediction in a novel setting, where a relation graph between tasks is available. We first construct a dataset including around 400 tasks as well as a task relation graph. Then to better utilize such relation graph, we propose a method called SGNN-EBM to systematically investigate the structured task modeling from two perspectives. (1) In the latent space, we model the task representations by applying a state graph neural network (SGNN) on the relation graph. (2) In the output space, we employ structured prediction with the energy-based model (EBM), which can be efficiently trained through noise-contrastive estimation (NCE) approach. Empirical results justify the effectiveness of SGNN-EBM. Code is available on https://github.com/chao1224/SGNN-EBM.
APA
Liu, S., Qu, M., Zhang, Z., Cai, H. & Tang, J. (2022). Structured Multi-task Learning for Molecular Property Prediction. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:8906-8920. Available from https://proceedings.mlr.press/v151/liu22e.html.
