Energy-based Modelling for Single-cell Data Annotation

Tianyi Liu, Philip Fradkin, Lazar Atanackovic, Leo J Lee
Proceedings of the 17th Machine Learning in Computational Biology meeting, PMLR 200:94-109, 2022.

Abstract

Single-cell sequencing has provided profound insights into understanding heterogeneous cellular activities by measuring sequence information at the individual cell resolution. Accurately annotating a single-cell RNA sequencing (scRNA-seq) dataset is a crucial step for the single-cell data analysis pipeline. In particular, previously unobserved cell types and cellular states frequently appear in scRNA-seq experiments and carry valuable information. This highlights the need for reliable annotation tools with out-of-distribution (OOD) detection capability. Recent advances in energy-based modelling have made it possible to design and deploy EBMs for joint discriminative and generative tasks. In this work, we introduced energy-based models (EBMs) for scRNA-seq annotation and investigated generative modelling for OOD detection, which result in more accurate, calibrated, and robust cell-type predictions. Specifically, we developed CLAMS, an EBM instance improved upon the previous joint energy-based model (JEM), for single-cell data hybrid modelling. Our experiments reveal that hybrid modelling with EBMs maintains the strong discriminative power of baseline classifiers and outperforms the state-of-the-art by integrating generative capabilities in data annotation and OOD detection tasks. To the best of our knowledge, we are the first to apply EBMs for single-cell data modelling.

Cite this Paper


BibTeX
@InProceedings{pmlr-v200-liu22b,
  title = {Energy-based Modelling for Single-cell Data Annotation},
  author = {Liu, Tianyi and Fradkin, Philip and Atanackovic, Lazar and Lee, Leo J},
  booktitle = {Proceedings of the 17th Machine Learning in Computational Biology meeting},
  pages = {94--109},
  year = {2022},
  editor = {Knowles, David A and Mostafavi, Sara and Lee, Su-In},
  volume = {200},
  series = {Proceedings of Machine Learning Research},
  month = {21--22 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v200/liu22b/liu22b.pdf},
  url = {https://proceedings.mlr.press/v200/liu22b.html},
  abstract = {Single-cell sequencing has provided profound insights into understanding heterogeneous cellular activities by measuring sequence information at the individual cell resolution. Accurately annotating a single-cell RNA sequencing (scRNA-seq) dataset is a crucial step for the single-cell data analysis pipeline. In particular, previously unobserved cell types and cellular states frequently appear in scRNA-seq experiments and carry valuable information. This highlights the need for reliable annotation tools with out-of-distribution (OOD) detection capability. Recent advances in energy-based modelling have made it possible to design and deploy EBMs for joint discriminative and generative tasks. In this work, we introduced energy-based models (EBMs) for scRNA-seq annotation and investigated generative modelling for OOD detection, which result in more accurate, calibrated, and robust cell-type predictions. Specifically, we developed CLAMS, an EBM instance improved upon the previous joint energy-based model (JEM), for single-cell data hybrid modelling. Our experiments reveal that hybrid modelling with EBMs maintains the strong discriminative power of baseline classifiers and outperforms the state-of-the-art by integrating generative capabilities in data annotation and OOD detection tasks. To the best of our knowledge, we are the first to apply EBMs for single-cell data modelling.}
}
Endnote
%0 Conference Paper
%T Energy-based Modelling for Single-cell Data Annotation
%A Tianyi Liu
%A Philip Fradkin
%A Lazar Atanackovic
%A Leo J Lee
%B Proceedings of the 17th Machine Learning in Computational Biology meeting
%C Proceedings of Machine Learning Research
%D 2022
%E David A Knowles
%E Sara Mostafavi
%E Su-In Lee
%F pmlr-v200-liu22b
%I PMLR
%P 94--109
%U https://proceedings.mlr.press/v200/liu22b.html
%V 200
%X Single-cell sequencing has provided profound insights into understanding heterogeneous cellular activities by measuring sequence information at the individual cell resolution. Accurately annotating a single-cell RNA sequencing (scRNA-seq) dataset is a crucial step for the single-cell data analysis pipeline. In particular, previously unobserved cell types and cellular states frequently appear in scRNA-seq experiments and carry valuable information. This highlights the need for reliable annotation tools with out-of-distribution (OOD) detection capability. Recent advances in energy-based modelling have made it possible to design and deploy EBMs for joint discriminative and generative tasks. In this work, we introduced energy-based models (EBMs) for scRNA-seq annotation and investigated generative modelling for OOD detection, which result in more accurate, calibrated, and robust cell-type predictions. Specifically, we developed CLAMS, an EBM instance improved upon the previous joint energy-based model (JEM), for single-cell data hybrid modelling. Our experiments reveal that hybrid modelling with EBMs maintains the strong discriminative power of baseline classifiers and outperforms the state-of-the-art by integrating generative capabilities in data annotation and OOD detection tasks. To the best of our knowledge, we are the first to apply EBMs for single-cell data modelling.
APA
Liu, T., Fradkin, P., Atanackovic, L. & Lee, L.J. (2022). Energy-based Modelling for Single-cell Data Annotation. Proceedings of the 17th Machine Learning in Computational Biology meeting, in Proceedings of Machine Learning Research 200:94-109. Available from https://proceedings.mlr.press/v200/liu22b.html.
