Multimodal single cell data integration challenge: Results and lessons learned

Christopher Lance, Malte D. Luecken, Daniel B. Burkhardt, Robrecht Cannoodt, Pia Rautenstrauch, Anna Laddach, Aidyn Ubingazhibov, Zhi-Jie Cao, Kaiwen Deng, Sumeer Khan, Qiao Liu, Nikolay Russkikh, Gleb Ryazantsev, Uwe Ohler, NeurIPS 2021 Multimodal data integration competition participants, Angela Oliveira Pisco, Jonathan Bloom, Smita Krishnaswamy, Fabian J. Theis
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, PMLR 176:162-176, 2022.

Abstract

Biology has become a data-intensive science. Recent technological advances in single-cell genomics have enabled the measurement of multiple facets of cellular state, producing datasets with millions of single-cell observations. While these data hold great promise for understanding molecular mechanisms in health and disease, analysis challenges arising from sparsity, technical and biological variability, and high dimensionality of the data hinder the derivation of such mechanistic insights. To promote the innovation of algorithms for analysis of multimodal single-cell data, we organized a competition at NeurIPS 2021 applying the Common Task Framework to multimodal single-cell data integration. For this competition we generated the first multimodal benchmarking dataset for single-cell biology and defined three tasks in this domain: prediction of missing modalities, aligning modalities, and learning a joint representation across modalities. We further specified evaluation metrics and developed a cloud-based algorithm evaluation pipeline. Using this setup, 280 competitors submitted over 2600 proposed solutions within a 3-month period, showcasing substantial innovation especially in the modality alignment task. Here, we present the results, describe trends of well-performing approaches, and discuss challenges associated with running the competition.

Cite this Paper


BibTeX
@InProceedings{pmlr-v176-lance22a,
  title = {Multimodal single cell data integration challenge: Results and lessons learned},
  author = {Lance, Christopher and Luecken, Malte D. and Burkhardt, Daniel B. and Cannoodt, Robrecht and Rautenstrauch, Pia and Laddach, Anna and Ubingazhibov, Aidyn and Cao, Zhi-Jie and Deng, Kaiwen and Khan, Sumeer and Liu, Qiao and Russkikh, Nikolay and Ryazantsev, Gleb and Ohler, Uwe and {NeurIPS 2021 Multimodal data integration competition participants} and Pisco, Angela Oliveira and Bloom, Jonathan and Krishnaswamy, Smita and Theis, Fabian J.},
  booktitle = {Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track},
  pages = {162--176},
  year = {2022},
  editor = {Kiela, Douwe and Ciccone, Marco and Caputo, Barbara},
  volume = {176},
  series = {Proceedings of Machine Learning Research},
  month = {06--14 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v176/lance22a/lance22a.pdf},
  url = {https://proceedings.mlr.press/v176/lance22a.html},
  abstract = {Biology has become a data-intensive science. Recent technological advances in single-cell genomics have enabled the measurement of multiple facets of cellular state, producing datasets with millions of single-cell observations. While these data hold great promise for understanding molecular mechanisms in health and disease, analysis challenges arising from sparsity, technical and biological variability, and high dimensionality of the data hinder the derivation of such mechanistic insights. To promote the innovation of algorithms for analysis of multimodal single-cell data, we organized a competition at NeurIPS 2021 applying the Common Task Framework to multimodal single-cell data integration. For this competition we generated the first multimodal benchmarking dataset for single-cell biology and defined three tasks in this domain: prediction of missing modalities, aligning modalities, and learning a joint representation across modalities. We further specified evaluation metrics and developed a cloud-based algorithm evaluation pipeline. Using this setup, 280 competitors submitted over 2600 proposed solutions within a 3-month period, showcasing substantial innovation especially in the modality alignment task. Here, we present the results, describe trends of well-performing approaches, and discuss challenges associated with running the competition.}
}
Endnote
%0 Conference Paper
%T Multimodal single cell data integration challenge: Results and lessons learned
%A Christopher Lance
%A Malte D. Luecken
%A Daniel B. Burkhardt
%A Robrecht Cannoodt
%A Pia Rautenstrauch
%A Anna Laddach
%A Aidyn Ubingazhibov
%A Zhi-Jie Cao
%A Kaiwen Deng
%A Sumeer Khan
%A Qiao Liu
%A Nikolay Russkikh
%A Gleb Ryazantsev
%A Uwe Ohler
%A NeurIPS 2021 Multimodal data integration competition participants
%A Angela Oliveira Pisco
%A Jonathan Bloom
%A Smita Krishnaswamy
%A Fabian J. Theis
%B Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track
%C Proceedings of Machine Learning Research
%D 2022
%E Douwe Kiela
%E Marco Ciccone
%E Barbara Caputo
%F pmlr-v176-lance22a
%I PMLR
%P 162--176
%U https://proceedings.mlr.press/v176/lance22a.html
%V 176
%X Biology has become a data-intensive science. Recent technological advances in single-cell genomics have enabled the measurement of multiple facets of cellular state, producing datasets with millions of single-cell observations. While these data hold great promise for understanding molecular mechanisms in health and disease, analysis challenges arising from sparsity, technical and biological variability, and high dimensionality of the data hinder the derivation of such mechanistic insights. To promote the innovation of algorithms for analysis of multimodal single-cell data, we organized a competition at NeurIPS 2021 applying the Common Task Framework to multimodal single-cell data integration. For this competition we generated the first multimodal benchmarking dataset for single-cell biology and defined three tasks in this domain: prediction of missing modalities, aligning modalities, and learning a joint representation across modalities. We further specified evaluation metrics and developed a cloud-based algorithm evaluation pipeline. Using this setup, 280 competitors submitted over 2600 proposed solutions within a 3-month period, showcasing substantial innovation especially in the modality alignment task. Here, we present the results, describe trends of well-performing approaches, and discuss challenges associated with running the competition.
APA
Lance, C., Luecken, M.D., Burkhardt, D.B., Cannoodt, R., Rautenstrauch, P., Laddach, A., Ubingazhibov, A., Cao, Z., Deng, K., Khan, S., Liu, Q., Russkikh, N., Ryazantsev, G., Ohler, U., NeurIPS 2021 Multimodal data integration competition participants, Pisco, A.O., Bloom, J., Krishnaswamy, S. & Theis, F.J. (2022). Multimodal single cell data integration challenge: Results and lessons learned. Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, in Proceedings of Machine Learning Research 176:162-176. Available from https://proceedings.mlr.press/v176/lance22a.html.

Related Material