Uncertainty Estimation for Single-cell Label Transfer

Robin Khatri, Stefan Bonn
Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 179:109-128, 2022.

Abstract

Single-cell gene expression matrices require a cell type label for each cell for downstream analysis. A cell type label refers to a heterogeneous group to which a cell belongs. Machine learning algorithms that aim to automate the assignment of cell type labels train on reference datasets for which cell type labels are already defined. However, these methods are prone to error due to possible preprocessing errors and the dynamic nature of cellular states. Therefore, it is essential to measure the uncertainty associated with classifications. Here, we hypothesize that conformal prediction may provide a principled approach for this. We examine inductive conformal classifiers (ICPs) on the task of single-cell label transfer. ICPs lead to well-calibrated models that quantify uncertainties well. Results are motivating, and the uncertainties are intuitive and easy to interpret. We also consider a confidence-credibility conformal predictions setup that accurately predicts single labels with the desired error level. Such a model can also reject the classification of cell types unobserved in the reference dataset. However, the presence of unknown cell types violates the underlying assumption of a conformal predictor and is highly dependent on the quality of batch correction. We envision more work in detecting unknown cell types and using conformal predictions to evaluate batch correction methods.
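The confidence-credibility idea mentioned in the abstract can be illustrated with a minimal sketch of an inductive conformal classifier. It assumes a nonconformity score of one minus the classifier's predicted probability for a label, which is a common choice but not necessarily the one used in the paper. The function names `icp_p_values` and `confidence_credibility` are hypothetical, and the inputs are assumed to be class-probability arrays from any already-trained classifier with at least two classes.

```python
import numpy as np


def icp_p_values(cal_probs, cal_labels, test_probs):
    """Inductive conformal p-values for every (test cell, candidate label) pair.

    Assumption: nonconformity = 1 - predicted probability of the label.
    cal_probs:  (n, k) predicted probabilities on a held-out calibration set
    cal_labels: (n,)   true integer labels of the calibration set
    test_probs: (m, k) predicted probabilities on the query cells
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)

    # Nonconformity of each calibration example with respect to its true label.
    cal_alpha = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    # Nonconformity of each test example under each candidate label.
    test_alpha = 1.0 - test_probs

    # For each candidate, count calibration scores at least as nonconforming.
    counts = (cal_alpha[None, None, :] >= test_alpha[:, :, None]).sum(axis=2)
    return (counts + 1) / (len(cal_alpha) + 1)


def confidence_credibility(p_values):
    """Single-label prediction with confidence and credibility.

    label       = candidate with the largest p-value
    credibility = largest p-value
    confidence  = 1 - second-largest p-value
    """
    ordered = np.sort(p_values, axis=1)
    labels = p_values.argmax(axis=1)
    credibility = ordered[:, -1]
    confidence = 1.0 - ordered[:, -2]
    return labels, confidence, credibility
```

Under this sketch, a query cell whose credibility is low conforms poorly to every reference label, so it could be flagged rather than classified. Any threshold for such a rejection is a design choice that this example leaves to the user.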

Cite this Paper


BibTeX
@InProceedings{pmlr-v179-khatri22a,
  title = {Uncertainty Estimation for Single-cell Label Transfer},
  author = {Khatri, Robin and Bonn, Stefan},
  booktitle = {Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications},
  pages = {109--128},
  year = {2022},
  editor = {Johansson, Ulf and Boström, Henrik and An Nguyen, Khuong and Luo, Zhiyuan and Carlsson, Lars},
  volume = {179},
  series = {Proceedings of Machine Learning Research},
  month = {24--26 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v179/khatri22a/khatri22a.pdf},
  url = {https://proceedings.mlr.press/v179/khatri22a.html},
  abstract = {Single-cell gene expression matrices require a cell type label for each cell for downstream analysis. A cell type label refers to a heterogeneous group to which a cell belongs. Machine learning algorithms that aim to automate the assignment of cell type labels train on reference datasets for which cell type labels are already defined. However, these methods are prone to error due to possible preprocessing errors and the dynamic nature of cellular states. Therefore, it is essential to measure the uncertainty associated with classifications. Here, we hypothesize that conformal prediction may provide a principled approach for this. We examine inductive conformal classifiers (ICPs) on the task of single-cell label transfer. ICPs lead to well-calibrated models that quantify uncertainties well. Results are motivating, and the uncertainties are intuitive and easy to interpret. We also consider a confidence-credibility conformal predictions setup that accurately predicts single labels with the desired error level. Such a model can also reject the classification of cell types unobserved in the reference dataset. However, the presence of unknown cell types violates the underlying assumption of a conformal predictor and is highly dependent on the quality of batch correction. We envision more work in detecting unknown cell types and using conformal predictions to evaluate batch correction methods.}
}
Endnote
%0 Conference Paper
%T Uncertainty Estimation for Single-cell Label Transfer
%A Robin Khatri
%A Stefan Bonn
%B Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications
%C Proceedings of Machine Learning Research
%D 2022
%E Ulf Johansson
%E Henrik Boström
%E Khuong An Nguyen
%E Zhiyuan Luo
%E Lars Carlsson
%F pmlr-v179-khatri22a
%I PMLR
%P 109--128
%U https://proceedings.mlr.press/v179/khatri22a.html
%V 179
%X Single-cell gene expression matrices require a cell type label for each cell for downstream analysis. A cell type label refers to a heterogeneous group to which a cell belongs. Machine learning algorithms that aim to automate the assignment of cell type labels train on reference datasets for which cell type labels are already defined. However, these methods are prone to error due to possible preprocessing errors and the dynamic nature of cellular states. Therefore, it is essential to measure the uncertainty associated with classifications. Here, we hypothesize that conformal prediction may provide a principled approach for this. We examine inductive conformal classifiers (ICPs) on the task of single-cell label transfer. ICPs lead to well-calibrated models that quantify uncertainties well. Results are motivating, and the uncertainties are intuitive and easy to interpret. We also consider a confidence-credibility conformal predictions setup that accurately predicts single labels with the desired error level. Such a model can also reject the classification of cell types unobserved in the reference dataset. However, the presence of unknown cell types violates the underlying assumption of a conformal predictor and is highly dependent on the quality of batch correction. We envision more work in detecting unknown cell types and using conformal predictions to evaluate batch correction methods.
APA
Khatri, R. & Bonn, S. (2022). Uncertainty Estimation for Single-cell Label Transfer. Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications, in Proceedings of Machine Learning Research 179:109-128. Available from https://proceedings.mlr.press/v179/khatri22a.html.
