Cross-Quality Few-Shot Transfer for Alloy Yield Strength Prediction: A New Materials Science Benchmark and A Sparsity-Oriented Optimization Framework

Xuxi Chen, Tianlong Chen, Everardo Yeriel Olivares, Kate Elder, Scott McCall, Aurelien Perron, Joseph McKeown, Bhavya Kailkhura, Zhangyang Wang, Brian Gallagher
Conference on Parsimony and Learning, PMLR 234:302-323, 2024.

Abstract

Discovering high-entropy alloys (HEAs) with high yield strength (YS) is crucial in materials science. However, YS can only be measured accurately through expensive and time-consuming experiments, and hence cannot be acquired at scale. Learning-based methods could facilitate the discovery, but the lack of a comprehensive dataset on HEA YS has created barriers. We present X-Yield, a materials science benchmark with 240 experimentally measured (high-quality) and over 100,000 simulated (low-quality) HEA YS data points. Due to the scarcity of experimental results and the quality gap with simulated data, existing transfer learning methods cannot generalize well on our dataset. We address this cross-quality few-shot transfer problem by leveraging model sparsification "twice": as a noise-robust feature regularizer at the pre-training stage, and as a data-efficient regularizer at the transfer stage. While the workflow already performs decently with sparsity patterns tuned independently for each stage, we propose a bi-level optimization framework, termed Bi-RPT, that jointly learns optimal masks and allocates sparsity for both stages. The effectiveness of Bi-RPT is validated through experiments on X-Yield, alongside other testbeds. Specifically, we achieve a reduction of 8.9-19.8% in test MSE and a gain of 0.98-1.53% in test accuracy, using only 5-10% of the hard-to-generate real experimental data. The code is available at https://github.com/VITA-Group/Bi-RPT.
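To make the idea of sparsifying "twice" concrete, the following is a minimal illustrative sketch, not the authors' Bi-RPT implementation. It assumes a linear model, synthetic data, simple magnitude pruning, and hand-picked sparsity levels (0.5 for pre-training, 0.7 for transfer); all function and variable names are hypothetical. Bi-RPT, by contrast, learns the masks and the sparsity allocation jointly through bi-level optimization, which this sketch does not attempt.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask keeping the largest-magnitude (1 - sparsity) fraction."""
    k = int(round(weights.size * (1.0 - sparsity)))  # number of weights to keep
    mask = np.zeros(weights.shape, dtype=float)
    if k > 0:
        keep = np.argsort(np.abs(weights).ravel())[-k:]
        mask.flat[keep] = 1.0
    return mask

def masked_gd(X, y, w, mask, lr=0.05, steps=500):
    """Gradient descent on MSE for a linear model; masked-out weights stay at zero."""
    w = w * mask
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad * mask
    return w

rng = np.random.default_rng(0)
d = 20
w_true = np.zeros(d)
w_true[:5] = rng.normal(size=5)  # only a few features matter (assumption)

# Synthetic stand-ins: many noisy "simulated" samples, few clean "experimental" ones.
X_sim = rng.normal(size=(5000, d))
y_sim = X_sim @ w_true + rng.normal(scale=1.0, size=5000)
X_exp = rng.normal(size=(12, d))
y_exp = X_exp @ w_true + rng.normal(scale=0.05, size=12)

# Stage 1 (pre-training): dense warm-up, prune by magnitude, retrain under the mask.
w_dense = masked_gd(X_sim, y_sim, np.zeros(d), np.ones(d))
mask_pre = magnitude_mask(w_dense, sparsity=0.5)
w_pre = masked_gd(X_sim, y_sim, w_dense, mask_pre)

# Stage 2 (transfer): prune further, then fine-tune on the few clean samples.
mask_ft = magnitude_mask(w_pre, sparsity=0.7)
w_ft = masked_gd(X_exp, y_exp, w_pre, mask_ft, steps=200)

print("kept after pre-training:", int(mask_pre.sum()))
print("kept after transfer:", int(mask_ft.sum()))
```

In this sketch the two sparsity levels are tuned independently by hand, which corresponds to the baseline workflow the abstract mentions; choosing them jointly is the part that the bi-level framework is meant to automate.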

Cite this Paper


BibTeX
@InProceedings{pmlr-v234-chen24a,
  title = {Cross-Quality Few-Shot Transfer for Alloy Yield Strength Prediction: A New Materials Science Benchmark and A Sparsity-Oriented Optimization Framework},
  author = {Chen, Xuxi and Chen, Tianlong and Olivares, Everardo Yeriel and Elder, Kate and McCall, Scott and Perron, Aurelien and McKeown, Joseph and Kailkhura, Bhavya and Wang, Zhangyang and Gallagher, Brian},
  booktitle = {Conference on Parsimony and Learning},
  pages = {302--323},
  year = {2024},
  editor = {Chi, Yuejie and Dziugaite, Gintare Karolina and Qu, Qing and Wang, Atlas Wang and Zhu, Zhihui},
  volume = {234},
  series = {Proceedings of Machine Learning Research},
  month = {03--06 Jan},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v234/chen24a/chen24a.pdf},
  url = {https://proceedings.mlr.press/v234/chen24a.html}
}
Endnote
%0 Conference Paper
%T Cross-Quality Few-Shot Transfer for Alloy Yield Strength Prediction: A New Materials Science Benchmark and A Sparsity-Oriented Optimization Framework
%A Xuxi Chen
%A Tianlong Chen
%A Everardo Yeriel Olivares
%A Kate Elder
%A Scott McCall
%A Aurelien Perron
%A Joseph McKeown
%A Bhavya Kailkhura
%A Zhangyang Wang
%A Brian Gallagher
%B Conference on Parsimony and Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Yuejie Chi
%E Gintare Karolina Dziugaite
%E Qing Qu
%E Atlas Wang Wang
%E Zhihui Zhu
%F pmlr-v234-chen24a
%I PMLR
%P 302--323
%U https://proceedings.mlr.press/v234/chen24a.html
%V 234
APA
Chen, X., Chen, T., Olivares, E. Y., Elder, K., McCall, S., Perron, A., McKeown, J., Kailkhura, B., Wang, Z., & Gallagher, B. (2024). Cross-Quality Few-Shot Transfer for Alloy Yield Strength Prediction: A New Materials Science Benchmark and A Sparsity-Oriented Optimization Framework. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 234:302-323. Available from https://proceedings.mlr.press/v234/chen24a.html.
