Uncertainty Estimation for Single-cell Label Transfer
Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 179:109-128, 2022.
Abstract
Single-cell gene expression matrices require a cell type label for each cell for downstream analysis. A cell type label refers to a heterogeneous group to which a cell belongs. Machine learning algorithms that automate the assignment of cell type labels are trained on reference datasets for which cell type labels are already defined. However, these methods are prone to error because of possible preprocessing errors and the dynamic nature of cellular states. It is therefore essential to measure the uncertainty associated with each classification. Here, we hypothesize that conformal prediction may provide a principled approach for this. We examine inductive conformal classifiers (ICPs) on the task of single-cell label transfer. ICPs lead to well-calibrated models whose uncertainty estimates are intuitive and easy to interpret, and the results are encouraging. We also consider a confidence-credibility conformal prediction setup that accurately predicts single labels at the desired error level. Such a model can also reject the classification of cell types unobserved in the reference dataset. However, the presence of unknown cell types violates the underlying assumption of a conformal predictor, and such rejection is highly dependent on the quality of batch correction. We envision more work on detecting unknown cell types and on using conformal prediction to evaluate batch correction methods.
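As an illustration of the two setups mentioned in the abstract, the following is a minimal sketch of an inductive conformal classifier, not the authors' implementation. It assumes that class probabilities for a held-out calibration set and for the query cells are already available from some underlying classifier, and it uses one minus the probability of a label as the nonconformity score, which is a common choice but an assumption here. The function names `icp_p_values`, `prediction_sets`, and `confidence_credibility` are hypothetical.

```python
import numpy as np


def icp_p_values(cal_probs, cal_labels, test_probs):
    """Return a (n_test, n_classes) array of conformal p-values.

    cal_probs:  (n_cal, n_classes) predicted probabilities on calibration cells
    cal_labels: (n_cal,) integer true labels of the calibration cells
    test_probs: (n_test, n_classes) predicted probabilities on query cells
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    test_probs = np.asarray(test_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    # Assumed nonconformity score: 1 - probability assigned to the label.
    cal_scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    test_scores = 1.0 - test_probs  # one score per (query cell, candidate label)
    # For each candidate label, count calibration scores that are at least
    # as nonconforming as the query cell's score under that label.
    ge = (cal_scores[None, None, :] >= test_scores[:, :, None]).sum(axis=2)
    return (ge + 1) / (len(cal_scores) + 1)


def prediction_sets(p_values, epsilon):
    """Set-valued prediction: keep every label whose p-value exceeds epsilon."""
    return [np.flatnonzero(row > epsilon).tolist() for row in p_values]


def confidence_credibility(p_values):
    """Single-label prediction with confidence and credibility.

    label:       label with the largest p-value
    credibility: the largest p-value
    confidence:  1 - the second largest p-value
    """
    p_values = np.asarray(p_values, dtype=float)
    ordered = np.sort(p_values, axis=1)
    label = np.argmax(p_values, axis=1)
    return label, 1.0 - ordered[:, -2], ordered[:, -1]
```

In this sketch, a query cell whose largest p-value (credibility) is low does not conform to any reference label; thresholding credibility is one possible way to reject such a cell, for example as a candidate unknown cell type.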