Multi-View Independent Component Analysis with Shared and Individual Sources

Teodora Pandeva, Patrick Forré
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1639-1650, 2023.

Abstract

Independent component analysis (ICA) is a blind source separation method for linear disentanglement of independent latent sources from observed data. We investigate the special setting of noisy linear ICA, referred to as ShIndICA, where the observations are split among different views, each receiving a mixture of shared and individual sources. We prove that the corresponding linear structure is identifiable and the sources distribution can be recovered. To computationally estimate the sources, we optimize a constrained form of the joint log-likelihood of the observed data among all views. Furthermore, we propose a model selection procedure for recovering the number of shared sources. Finally, we empirically demonstrate the advantages of our model over baselines. We apply ShIndICA in a challenging real-life task, using three transcriptome datasets provided by three different labs (three different views). The recovered sources were used for a downstream graph inference task, facilitating the discovery of a plausible representation of the data’s underlying graph structure.
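The multi-view setting described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `simulate_views`, the Laplace source distribution, the square Gaussian mixing matrices, and the additive noise on the observations are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def simulate_views(n_samples=1000, n_views=3, n_shared=2, n_individual=2,
                   noise_std=0.1, seed=0):
    """Simulate views that each linearly mix shared and view-specific sources.

    Assumptions (not taken from the paper): Laplace-distributed sources,
    square Gaussian mixing matrices, isotropic Gaussian observation noise.
    """
    rng = np.random.default_rng(seed)
    k = n_shared + n_individual

    # Sources common to every view (non-Gaussian, as ICA requires).
    shared = rng.laplace(size=(n_shared, n_samples))

    views, mixings, individuals = [], [], []
    for _ in range(n_views):
        # Sources that appear only in this view.
        individual = rng.laplace(size=(n_individual, n_samples))
        sources = np.vstack([shared, individual])

        # View-specific linear mixing plus additive noise.
        mixing = rng.normal(size=(k, k))
        noise = noise_std * rng.normal(size=(k, n_samples))
        views.append(mixing @ sources + noise)

        mixings.append(mixing)
        individuals.append(individual)

    return views, mixings, shared, individuals


if __name__ == "__main__":
    views, mixings, shared, individuals = simulate_views()
    for d, x in enumerate(views):
        print(f"view {d}: observations of shape {x.shape}")
```

In this toy setup the first `n_shared` rows of every view's source matrix are identical, which is the structure a multi-view method would aim to recover from the observations alone.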

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-pandeva23a,
  title = {Multi-View Independent Component Analysis with Shared and Individual Sources},
  author = {Pandeva, Teodora and Forr\'e, Patrick},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages = {1639--1650},
  year = {2023},
  editor = {Evans, Robin J. and Shpitser, Ilya},
  volume = {216},
  series = {Proceedings of Machine Learning Research},
  month = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v216/pandeva23a/pandeva23a.pdf},
  url = {https://proceedings.mlr.press/v216/pandeva23a.html},
  abstract = {Independent component analysis (ICA) is a blind source separation method for linear disentanglement of independent latent sources from observed data. We investigate the special setting of noisy linear ICA, referred to as ShIndICA, where the observations are split among different views, each receiving a mixture of shared and individual sources. We prove that the corresponding linear structure is identifiable and the sources distribution can be recovered. To computationally estimate the sources, we optimize a constrained form of the joint log-likelihood of the observed data among all views. Furthermore, we propose a model selection procedure for recovering the number of shared sources. Finally, we empirically demonstrate the advantages of our model over baselines. We apply ShIndICA in a challenging real-life task, using three transcriptome datasets provided by three different labs (three different views). The recovered sources were used for a downstream graph inference task, facilitating the discovery of a plausible representation of the data’s underlying graph structure.}
}
Endnote
%0 Conference Paper
%T Multi-View Independent Component Analysis with Shared and Individual Sources
%A Teodora Pandeva
%A Patrick Forré
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-pandeva23a
%I PMLR
%P 1639--1650
%U https://proceedings.mlr.press/v216/pandeva23a.html
%V 216
%X Independent component analysis (ICA) is a blind source separation method for linear disentanglement of independent latent sources from observed data. We investigate the special setting of noisy linear ICA, referred to as ShIndICA, where the observations are split among different views, each receiving a mixture of shared and individual sources. We prove that the corresponding linear structure is identifiable and the sources distribution can be recovered. To computationally estimate the sources, we optimize a constrained form of the joint log-likelihood of the observed data among all views. Furthermore, we propose a model selection procedure for recovering the number of shared sources. Finally, we empirically demonstrate the advantages of our model over baselines. We apply ShIndICA in a challenging real-life task, using three transcriptome datasets provided by three different labs (three different views). The recovered sources were used for a downstream graph inference task, facilitating the discovery of a plausible representation of the data’s underlying graph structure.
APA
Pandeva, T. & Forré, P. (2023). Multi-View Independent Component Analysis with Shared and Individual Sources. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1639-1650. Available from https://proceedings.mlr.press/v216/pandeva23a.html.
