Bayesian Optimization for Crop Genetics with Scalable Probabilistic Models

Ruhana Azam, Sang T. Truong, Samuel B. Fernandes, Andrew D.B. Leakey, Alexander Lipka, Mohammed El-Kebir, Sanmi Koyejo
Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference, PMLR 253:30-44, 2024.

Abstract

An overarching goal of crop improvement is to select plants with desirable traits so that crops can provide sufficient food and nutrients for humanity in the face of climate change. To achieve such a goal, crop breeders utilize genomic prediction, in which genome-wide DNA marker information is used to predict breeding values for desirable traits. Genomic prediction is complemented by advancements in high-throughput phenotyping, in which robots and drones collect orders of magnitude more trait information than in the past. Although such data are abundant and easy to collect, identifying the most biologically meaningful traits for use in genomic prediction is expensive. Bayesian optimization (BO) is a strong, cost-effective solution for identifying such meaningful traits. In this work, we quantified the performance of BO with a collection of acquisition functions and surrogate models for identifying good proxies in a set of more than 4 million proxies. We found that BO achieves comparable sample efficiency to random search while requiring significantly less computation. Although traditional BO and random search techniques perform sufficiently well, both fail to leverage information from related tasks. To this end, we propose a pre-trained model as a transfer learning method. Using this benchmark, we conduct an extensive empirical study and demonstrate promising results on the transfer learning effect, highlighting a core design principle for developing more parsimonious optimization algorithms for crop improvement.

Cite this Paper


BibTeX
@InProceedings{pmlr-v253-azam24a,
  title = {Bayesian Optimization for Crop Genetics with Scalable Probabilistic Models},
  author = {Azam, Ruhana and Truong, Sang T. and Fernandes, Samuel B. and Leakey, Andrew D.B. and Lipka, Alexander and El-Kebir, Mohammed and Koyejo, Sanmi},
  booktitle = {Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference},
  pages = {30--44},
  year = {2024},
  editor = {Antorán, Javier and Naesseth, Christian A.},
  volume = {253},
  series = {Proceedings of Machine Learning Research},
  month = {21 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v253/main/assets/azam24a/azam24a.pdf},
  url = {https://proceedings.mlr.press/v253/azam24a.html},
  abstract = {An overarching goal of crop improvement is to select plants with desirable traits so that crops can provide sufficient food and nutrients for humanity in the face of climate change. To achieve such a goal, crop breeders utilize genomic prediction, in which genome-wide DNA marker information is used to predict breeding values for desirable traits. Genomic prediction is complemented by advancements in high-throughput phenotyping, in which robots and drones collect orders of magnitude more trait information than in the past. Although such data are abundant and easy to collect, identifying the most biologically meaningful traits for use in genomic prediction is expensive. Bayesian optimization (BO) is a strong, cost-effective solution for identifying such meaningful traits. In this work, we quantified the performance of BO with a collection of acquisition functions and surrogate models for identifying good proxies in a set of more than 4 million proxies. We found that BO achieves comparable sample efficiency to random search while requiring significantly less computation. Although traditional BO and random search techniques perform sufficiently well, both fail to leverage information from related tasks. To this end, we propose a pre-trained model as a transfer learning method. Using this benchmark, we conduct an extensive empirical study and demonstrate promising results on the transfer learning effect, highlighting a core design principle for developing more parsimonious optimization algorithms for crop improvement.}
}
Endnote
%0 Conference Paper
%T Bayesian Optimization for Crop Genetics with Scalable Probabilistic Models
%A Ruhana Azam
%A Sang T. Truong
%A Samuel B. Fernandes
%A Andrew D.B. Leakey
%A Alexander Lipka
%A Mohammed El-Kebir
%A Sanmi Koyejo
%B Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference
%C Proceedings of Machine Learning Research
%D 2024
%E Javier Antorán
%E Christian A. Naesseth
%F pmlr-v253-azam24a
%I PMLR
%P 30--44
%U https://proceedings.mlr.press/v253/azam24a.html
%V 253
%X An overarching goal of crop improvement is to select plants with desirable traits so that crops can provide sufficient food and nutrients for humanity in the face of climate change. To achieve such a goal, crop breeders utilize genomic prediction, in which genome-wide DNA marker information is used to predict breeding values for desirable traits. Genomic prediction is complemented by advancements in high-throughput phenotyping, in which robots and drones collect orders of magnitude more trait information than in the past. Although such data are abundant and easy to collect, identifying the most biologically meaningful traits for use in genomic prediction is expensive. Bayesian optimization (BO) is a strong, cost-effective solution for identifying such meaningful traits. In this work, we quantified the performance of BO with a collection of acquisition functions and surrogate models for identifying good proxies in a set of more than 4 million proxies. We found that BO achieves comparable sample efficiency to random search while requiring significantly less computation. Although traditional BO and random search techniques perform sufficiently well, both fail to leverage information from related tasks. To this end, we propose a pre-trained model as a transfer learning method. Using this benchmark, we conduct an extensive empirical study and demonstrate promising results on the transfer learning effect, highlighting a core design principle for developing more parsimonious optimization algorithms for crop improvement.
APA
Azam, R., Truong, S.T., Fernandes, S.B., Leakey, A.D.B., Lipka, A., El-Kebir, M. & Koyejo, S. (2024). Bayesian Optimization for Crop Genetics with Scalable Probabilistic Models. Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference, in Proceedings of Machine Learning Research 253:30-44. Available from https://proceedings.mlr.press/v253/azam24a.html.
