Active Learning for Efficient Discovery of Optimal Combinatorial Perturbations

Jason Qin, Hans-Hermann Wessels, Carlos Fernandez-Granda, Yuhan Hao
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:50363-50383, 2025.

Abstract

Combinatorial CRISPR screening enables large-scale identification of synergistic gene pairs for combination therapies, but exhaustive experimentation is infeasible. We introduce NAIAD, an active learning framework that efficiently discovers optimal gene pairs by leveraging single-gene perturbation effects and adaptive gene embeddings that scale with the training data size, mitigating overfitting in small-sample learning while capturing complex gene interactions as more data is collected. Evaluated on four CRISPR datasets with over 350,000 interactions, NAIAD trained on small datasets outperforms existing models by up to 40%. Its recommendation system prioritizes gene pairs with maximum predicted effects, accelerating discovery with fewer experiments. We also extend NAIAD to optimal drug combination identification among 2,000 candidates. Overall, NAIAD enhances combinatorial perturbation design and drives advances in genomics research and therapeutic development in combination therapy. Our code is publicly available at: https://github.com/NeptuneBio/NAIAD

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-qin25g,
  title = {Active Learning for Efficient Discovery of Optimal Combinatorial Perturbations},
  author = {Qin, Jason and Wessels, Hans-Hermann and Fernandez-Granda, Carlos and Hao, Yuhan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {50363--50383},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/qin25g/qin25g.pdf},
  url = {https://proceedings.mlr.press/v267/qin25g.html},
  abstract = {Combinatorial CRISPR screening enables large-scale identification of synergistic gene pairs for combination therapies, but exhaustive experimentation is infeasible. We introduce NAIAD, an active learning framework that efficiently discovers optimal gene pairs by leveraging single-gene perturbation effects and adaptive gene embeddings that scale with the training data size, mitigating overfitting in small-sample learning while capturing complex gene interactions as more data is collected. Evaluated on four CRISPR datasets with over 350,000 interactions, NAIAD trained on small datasets outperforms existing models by up to 40%. Its recommendation system prioritizes gene pairs with maximum predicted effects, accelerating discovery with fewer experiments. We also extend NAIAD to optimal drug combination identification among 2,000 candidates. Overall, NAIAD enhances combinatorial perturbation design and drives advances in genomics research and therapeutic development in combination therapy. Our code is publicly available at: https://github.com/NeptuneBio/NAIAD}
}
Endnote
%0 Conference Paper
%T Active Learning for Efficient Discovery of Optimal Combinatorial Perturbations
%A Jason Qin
%A Hans-Hermann Wessels
%A Carlos Fernandez-Granda
%A Yuhan Hao
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-qin25g
%I PMLR
%P 50363--50383
%U https://proceedings.mlr.press/v267/qin25g.html
%V 267
%X Combinatorial CRISPR screening enables large-scale identification of synergistic gene pairs for combination therapies, but exhaustive experimentation is infeasible. We introduce NAIAD, an active learning framework that efficiently discovers optimal gene pairs by leveraging single-gene perturbation effects and adaptive gene embeddings that scale with the training data size, mitigating overfitting in small-sample learning while capturing complex gene interactions as more data is collected. Evaluated on four CRISPR datasets with over 350,000 interactions, NAIAD trained on small datasets outperforms existing models by up to 40%. Its recommendation system prioritizes gene pairs with maximum predicted effects, accelerating discovery with fewer experiments. We also extend NAIAD to optimal drug combination identification among 2,000 candidates. Overall, NAIAD enhances combinatorial perturbation design and drives advances in genomics research and therapeutic development in combination therapy. Our code is publicly available at: https://github.com/NeptuneBio/NAIAD
APA
Qin, J., Wessels, H.-H., Fernandez-Granda, C., & Hao, Y. (2025). Active Learning for Efficient Discovery of Optimal Combinatorial Perturbations. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:50363-50383. Available from https://proceedings.mlr.press/v267/qin25g.html.
