Disentangling shared and group-specific variations in single-cell transcriptomics data with multiGroupVI

Ethan Weinberger, Romain Lopez, Jan-Christian Huetter, Aviv Regev
Proceedings of the 17th Machine Learning in Computational Biology meeting, PMLR 200:16-32, 2022.

Abstract

Single-cell RNA sequencing (scRNA-seq) technologies have enabled a greater understanding of previously unexplored biological diversity. By design of such experiments, individual cells from scRNA-seq datasets can often be attributed to non-overlapping “groups”. For example, these group labels may denote the cell’s tissue or cell line of origin. In this setting, one important problem consists in discerning patterns in the data that are shared across groups versus those that are group-specific. However, existing methods for this type of analysis are mainly limited to (generalized) linear latent variable models. Here we introduce multiGroupVI, a deep generative model for analyzing grouped scRNA-seq datasets that decomposes the data into shared and group-specific factors of variation. We first validate our approach on a simulated dataset, on which we significantly outperform state-of-the-art methods. We then apply it to explore regional differences in an scRNA-seq dataset sampled from multiple regions of the mouse small intestine. We implemented multiGroupVI using the scvi-tools library, and released it as open-source software at www.placeholder.com.

Cite this Paper


BibTeX
@InProceedings{pmlr-v200-weinberger22a,
  title     = {Disentangling shared and group-specific variations in single-cell transcriptomics data with multiGroupVI},
  author    = {Weinberger, Ethan and Lopez, Romain and Huetter, Jan-Christian and Regev, Aviv},
  booktitle = {Proceedings of the 17th Machine Learning in Computational Biology meeting},
  pages     = {16--32},
  year      = {2022},
  editor    = {Knowles, David A and Mostafavi, Sara and Lee, Su-In},
  volume    = {200},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--22 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v200/weinberger22a/weinberger22a.pdf},
  url       = {https://proceedings.mlr.press/v200/weinberger22a.html},
  abstract  = {Single-cell RNA sequencing (scRNA-seq) technologies have enabled a greater understanding of previously unexplored biological diversity. By design of such experiments, individual cells from scRNA-seq datasets can often be attributed to non-overlapping “groups”. For example, these group labels may denote the cell’s tissue or cell line of origin. In this setting, one important problem consists in discerning patterns in the data that are shared across groups versus those that are group-specific. However, existing methods for this type of analysis are mainly limited to (generalized) linear latent variable models. Here we introduce multiGroupVI, a deep generative model for analyzing grouped scRNA-seq datasets that decomposes the data into shared and group-specific factors of variation. We first validate our approach on a simulated dataset, on which we significantly outperform state-of-the-art methods. We then apply it to explore regional differences in an scRNA-seq dataset sampled from multiple regions of the mouse small intestine. We implemented multiGroupVI using the scvi-tools library, and released it as open-source software at www.placeholder.com.}
}
Endnote
%0 Conference Paper
%T Disentangling shared and group-specific variations in single-cell transcriptomics data with multiGroupVI
%A Ethan Weinberger
%A Romain Lopez
%A Jan-Christian Huetter
%A Aviv Regev
%B Proceedings of the 17th Machine Learning in Computational Biology meeting
%C Proceedings of Machine Learning Research
%D 2022
%E David A Knowles
%E Sara Mostafavi
%E Su-In Lee
%F pmlr-v200-weinberger22a
%I PMLR
%P 16--32
%U https://proceedings.mlr.press/v200/weinberger22a.html
%V 200
%X Single-cell RNA sequencing (scRNA-seq) technologies have enabled a greater understanding of previously unexplored biological diversity. By design of such experiments, individual cells from scRNA-seq datasets can often be attributed to non-overlapping “groups”. For example, these group labels may denote the cell’s tissue or cell line of origin. In this setting, one important problem consists in discerning patterns in the data that are shared across groups versus those that are group-specific. However, existing methods for this type of analysis are mainly limited to (generalized) linear latent variable models. Here we introduce multiGroupVI, a deep generative model for analyzing grouped scRNA-seq datasets that decomposes the data into shared and group-specific factors of variation. We first validate our approach on a simulated dataset, on which we significantly outperform state-of-the-art methods. We then apply it to explore regional differences in an scRNA-seq dataset sampled from multiple regions of the mouse small intestine. We implemented multiGroupVI using the scvi-tools library, and released it as open-source software at www.placeholder.com.
APA
Weinberger, E., Lopez, R., Huetter, J.-C., & Regev, A. (2022). Disentangling shared and group-specific variations in single-cell transcriptomics data with multiGroupVI. Proceedings of the 17th Machine Learning in Computational Biology meeting, in Proceedings of Machine Learning Research 200:16-32. Available from https://proceedings.mlr.press/v200/weinberger22a.html.
