Theoretical Analysis of Sparse Subspace Clustering with Missing Entries

Manolis Tsakiris, Rene Vidal
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4975-4984, 2018.

Abstract

Sparse Subspace Clustering (SSC) is a popular unsupervised machine learning method for clustering data lying close to an unknown union of low-dimensional linear subspaces; a problem with numerous applications in pattern recognition and computer vision. Even though the behavior of SSC for complete data is by now well-understood, little is known about its theoretical properties when applied to data with missing entries. In this paper we give theoretical guarantees for SSC with incomplete data, and provide theoretical evidence that projecting the zero-filled data onto the observation pattern of the point being expressed can lead to substantial improvement in performance; a phenomenon already known experimentally. The main insight of our analysis is that even though this projection induces additional missing entries, this is counterbalanced by the fact that the projected and zero-filled data are in effect incomplete points associated with the union of the corresponding projected subspaces, with respect to which the point being expressed is complete. The significance of this phenomenon potentially extends to the entire class of self-expressive methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-tsakiris18a, title = {Theoretical Analysis of Sparse Subspace Clustering with Missing Entries}, author = {Tsakiris, Manolis and Vidal, Rene}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, pages = {4975--4984}, year = {2018}, editor = {Dy, Jennifer and Krause, Andreas}, volume = {80}, series = {Proceedings of Machine Learning Research}, month = {10--15 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v80/tsakiris18a/tsakiris18a.pdf}, url = {https://proceedings.mlr.press/v80/tsakiris18a.html}, abstract = {Sparse Subspace Clustering (SSC) is a popular unsupervised machine learning method for clustering data lying close to an unknown union of low-dimensional linear subspaces; a problem with numerous applications in pattern recognition and computer vision. Even though the behavior of SSC for complete data is by now well-understood, little is known about its theoretical properties when applied to data with missing entries. In this paper we give theoretical guarantees for SSC with incomplete data, and provide theoretical evidence that projecting the zero-filled data onto the observation pattern of the point being expressed can lead to substantial improvement in performance; a phenomenon already known experimentally. The main insight of our analysis is that even though this projection induces additional missing entries, this is counterbalanced by the fact that the projected and zero-filled data are in effect incomplete points associated with the union of the corresponding projected subspaces, with respect to which the point being expressed is complete. The significance of this phenomenon potentially extends to the entire class of self-expressive methods.} }
Endnote
%0 Conference Paper %T Theoretical Analysis of Sparse Subspace Clustering with Missing Entries %A Manolis Tsakiris %A Rene Vidal %B Proceedings of the 35th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2018 %E Jennifer Dy %E Andreas Krause %F pmlr-v80-tsakiris18a %I PMLR %P 4975--4984 %U https://proceedings.mlr.press/v80/tsakiris18a.html %V 80 %X Sparse Subspace Clustering (SSC) is a popular unsupervised machine learning method for clustering data lying close to an unknown union of low-dimensional linear subspaces; a problem with numerous applications in pattern recognition and computer vision. Even though the behavior of SSC for complete data is by now well-understood, little is known about its theoretical properties when applied to data with missing entries. In this paper we give theoretical guarantees for SSC with incomplete data, and provide theoretical evidence that projecting the zero-filled data onto the observation pattern of the point being expressed can lead to substantial improvement in performance; a phenomenon already known experimentally. The main insight of our analysis is that even though this projection induces additional missing entries, this is counterbalanced by the fact that the projected and zero-filled data are in effect incomplete points associated with the union of the corresponding projected subspaces, with respect to which the point being expressed is complete. The significance of this phenomenon potentially extends to the entire class of self-expressive methods.
APA
Tsakiris, M. & Vidal, R.. (2018). Theoretical Analysis of Sparse Subspace Clustering with Missing Entries. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4975-4984 Available from https://proceedings.mlr.press/v80/tsakiris18a.html.

Related Material