Uncertainty-aware pseudo-labeling for quantum calculations

Kexin Huang, Vishnu Sresht, Brajesh Rai, Mykola Bordyuh
Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PMLR 180:853-862, 2022.

Abstract

Machine learning models have recently shown promise in predicting molecular quantum chemical properties. However, the path to real-life adoption requires (1) learning under low-resource constraints and (2) out-of-distribution generalization to unseen, structurally diverse molecules. We observe that these two challenges can be addressed via abundant labels, which is often not the case in quantum chemistry. We hypothesize that pseudo-labeling on a vast array of unlabeled molecules can serve as gold-label proxies to expand the training labeled dataset significantly. The challenge in pseudo-labeling is to prevent the bad pseudo-labels from biasing the model. Motivated by the entropy minimization framework, we develop a simple and effective strategy Pseudo that can assign pseudo-labels, detect bad pseudo-labels through evidential uncertainty, and prevent them from biasing the model using adaptive weighting. Empirically, Pseudo improves quantum calculations accuracy in full data, low data, and out-of-distribution settings.

Cite this Paper


BibTeX
@InProceedings{pmlr-v180-huang22a, title = {Uncertainty-aware pseudo-labeling for quantum calculations}, author = {Huang, Kexin and Sresht, Vishnu and Rai, Brajesh and Bordyuh, Mykola}, booktitle = {Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence}, pages = {853--862}, year = {2022}, editor = {Cussens, James and Zhang, Kun}, volume = {180}, series = {Proceedings of Machine Learning Research}, month = {01--05 Aug}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v180/huang22a/huang22a.pdf}, url = {https://proceedings.mlr.press/v180/huang22a.html}, abstract = {Machine learning models have recently shown promise in predicting molecular quantum chemical properties. However, the path to real-life adoption requires (1) learning under low-resource constraints and (2) out-of-distribution generalization to unseen, structurally diverse molecules. We observe that these two challenges can be addressed via abundant labels, which is often not the case in quantum chemistry. We hypothesize that pseudo-labeling on a vast array of unlabeled molecules can serve as gold-label proxies to expand the training labeled dataset significantly. The challenge in pseudo-labeling is to prevent the bad pseudo-labels from biasing the model. Motivated by the entropy minimization framework, we develop a simple and effective strategy Pseudo that can assign pseudo-labels, detect bad pseudo-labels through evidential uncertainty, and prevent them from biasing the model using adaptive weighting. Empirically, Pseudo improves quantum calculations accuracy in full data, low data, and out-of-distribution settings. } }
Endnote
%0 Conference Paper %T Uncertainty-aware pseudo-labeling for quantum calculations %A Kexin Huang %A Vishnu Sresht %A Brajesh Rai %A Mykola Bordyuh %B Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence %C Proceedings of Machine Learning Research %D 2022 %E James Cussens %E Kun Zhang %F pmlr-v180-huang22a %I PMLR %P 853--862 %U https://proceedings.mlr.press/v180/huang22a.html %V 180 %X Machine learning models have recently shown promise in predicting molecular quantum chemical properties. However, the path to real-life adoption requires (1) learning under low-resource constraints and (2) out-of-distribution generalization to unseen, structurally diverse molecules. We observe that these two challenges can be addressed via abundant labels, which is often not the case in quantum chemistry. We hypothesize that pseudo-labeling on a vast array of unlabeled molecules can serve as gold-label proxies to expand the training labeled dataset significantly. The challenge in pseudo-labeling is to prevent the bad pseudo-labels from biasing the model. Motivated by the entropy minimization framework, we develop a simple and effective strategy Pseudo that can assign pseudo-labels, detect bad pseudo-labels through evidential uncertainty, and prevent them from biasing the model using adaptive weighting. Empirically, Pseudo improves quantum calculations accuracy in full data, low data, and out-of-distribution settings.
APA
Huang, K., Sresht, V., Rai, B. & Bordyuh, M.. (2022). Uncertainty-aware pseudo-labeling for quantum calculations. Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 180:853-862 Available from https://proceedings.mlr.press/v180/huang22a.html.

Related Material