Multivariate Conformal Selection

Tian Bai, Yue Zhao, Xiang Yu, Archer Y. Yang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:2535-2559, 2025.

Abstract

Selecting high-quality candidates from large datasets is critical in applications such as drug discovery, precision medicine, and alignment of large language models (LLMs). While Conformal Selection (CS) provides rigorous uncertainty quantification, it is limited to univariate responses and scalar criteria. To address this, we propose Multivariate Conformal Selection (mCS), a generalization of CS designed for multivariate response settings. Our method introduces regional monotonicity and employs multivariate nonconformity scores to construct conformal $p$-values, enabling finite-sample False Discovery Rate (FDR) control. We present two variants: $\texttt{mCS-dist}$, using distance-based scores, and $\texttt{mCS-learn}$, which learns optimal scores via differentiable optimization. Experiments on simulated and real-world datasets demonstrate that mCS significantly improves selection power while maintaining FDR control, establishing it as a robust framework for multivariate selection tasks.
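For orientation, the univariate CS baseline that the abstract refers to can be sketched as conformal $p$-values followed by the Benjamini-Hochberg (BH) procedure. The sketch below is not the paper's mCS method and is not taken from the authors' code. The function names, the toy numbers, and the score $V(x, y) = y - \hat{\mu}(x)$ are illustrative assumptions, and ties are handled deterministically rather than with random tie-breaking.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    """Conformal p-values: (1 + #{calibration scores < test score}) / (n + 1).

    Hypothetical helper. Assumes a monotone score such as V(x, y) = y - mu(x),
    with calibration scores V(X_i, Y_i) and test scores V(X_j, c) evaluated at
    the selection threshold c.
    """
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    # side="left" counts calibration scores strictly below each test score.
    n_below = np.searchsorted(cal, test, side="left")
    return (1.0 + n_below) / (len(cal) + 1.0)


def benjamini_hochberg(pvals, q):
    """Return sorted indices rejected by the BH procedure at nominal level q."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    if passed.size == 0:
        return np.array([], dtype=int)
    k = passed[-1] + 1  # largest rank whose p-value clears its threshold
    return np.sort(order[:k])


# Toy data (assumed): 19 calibration scores and 3 test scores.
cal_scores = np.arange(1, 20)
test_scores = np.array([0.5, 10.0, 25.0])

pvals = conformal_pvalues(cal_scores, test_scores)  # [0.05, 0.5, 1.0]
selected = benjamini_hochberg(pvals, q=0.2)         # [0]
```

A small test score (a prediction well above the threshold) yields a small $p$-value, so only the first test point is selected here. According to the abstract, mCS replaces the scalar score with a multivariate nonconformity score that satisfies regional monotonicity.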

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-bai25d,
  title = {Multivariate Conformal Selection},
  author = {Bai, Tian and Zhao, Yue and Yu, Xiang and Yang, Archer Y.},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {2535--2559},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/bai25d/bai25d.pdf},
  url = {https://proceedings.mlr.press/v267/bai25d.html},
  abstract = {Selecting high-quality candidates from large datasets is critical in applications such as drug discovery, precision medicine, and alignment of large language models (LLMs). While Conformal Selection (CS) provides rigorous uncertainty quantification, it is limited to univariate responses and scalar criteria. To address this, we propose Multivariate Conformal Selection (mCS), a generalization of CS designed for multivariate response settings. Our method introduces regional monotonicity and employs multivariate nonconformity scores to construct conformal $p$-values, enabling finite-sample False Discovery Rate (FDR) control. We present two variants: $\texttt{mCS-dist}$, using distance-based scores, and $\texttt{mCS-learn}$, which learns optimal scores via differentiable optimization. Experiments on simulated and real-world datasets demonstrate that mCS significantly improves selection power while maintaining FDR control, establishing it as a robust framework for multivariate selection tasks.}
}
Endnote
%0 Conference Paper
%T Multivariate Conformal Selection
%A Tian Bai
%A Yue Zhao
%A Xiang Yu
%A Archer Y. Yang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-bai25d
%I PMLR
%P 2535--2559
%U https://proceedings.mlr.press/v267/bai25d.html
%V 267
%X Selecting high-quality candidates from large datasets is critical in applications such as drug discovery, precision medicine, and alignment of large language models (LLMs). While Conformal Selection (CS) provides rigorous uncertainty quantification, it is limited to univariate responses and scalar criteria. To address this, we propose Multivariate Conformal Selection (mCS), a generalization of CS designed for multivariate response settings. Our method introduces regional monotonicity and employs multivariate nonconformity scores to construct conformal $p$-values, enabling finite-sample False Discovery Rate (FDR) control. We present two variants: $\texttt{mCS-dist}$, using distance-based scores, and $\texttt{mCS-learn}$, which learns optimal scores via differentiable optimization. Experiments on simulated and real-world datasets demonstrate that mCS significantly improves selection power while maintaining FDR control, establishing it as a robust framework for multivariate selection tasks.
APA
Bai, T., Zhao, Y., Yu, X., & Yang, A. Y. (2025). Multivariate Conformal Selection. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:2535-2559. Available from https://proceedings.mlr.press/v267/bai25d.html.
