Understanding Doubly Stochastic Clustering

Tianjiao Ding; Derek Lim; Rene Vidal; Benjamin D Haeffele

Understanding Doubly Stochastic Clustering

Tianjiao Ding, Derek Lim, Rene Vidal, Benjamin D Haeffele

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:5153-5165, 2022.

Abstract

The problem of projecting a matrix onto the space of doubly stochastic matrices finds several applications in machine learning. For example, in spectral clustering, it has been shown that forming the normalized Laplacian matrix from a data affinity matrix has close connections to projecting it onto the set of doubly stochastic matrices. However, the analysis of why this projection improves clustering has been limited. In this paper we present theoretical conditions on the given affinity matrix under which its doubly stochastic projection is an ideal affinity matrix (i.e., it has no false connections between clusters, and is well-connected within each cluster). In particular, we show that a necessary and sufficient condition for a projected affinity matrix to be ideal reduces to a set of conditions on the input affinity that decompose along each cluster. Further, in the subspace clustering problem, where each cluster is defined by a linear subspace, we provide geometric conditions on the underlying subspaces which guarantee correct clustering via a continuous version of the problem. This allows us to explain theoretically the remarkable performance of a recently proposed doubly stochastic subspace clustering method.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-ding22a,
  title = 	 {Understanding Doubly Stochastic Clustering},
  author =       {Ding, Tianjiao and Lim, Derek and Vidal, Rene and Haeffele, Benjamin D},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {5153--5165},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/ding22a/ding22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/ding22a.html},
  abstract = 	 {The problem of projecting a matrix onto the space of doubly stochastic matrices finds several applications in machine learning. For example, in spectral clustering, it has been shown that forming the normalized Laplacian matrix from a data affinity matrix has close connections to projecting it onto the set of doubly stochastic matrices. However, the analysis of why this projection improves clustering has been limited. In this paper we present theoretical conditions on the given affinity matrix under which its doubly stochastic projection is an ideal affinity matrix (i.e., it has no false connections between clusters, and is well-connected within each cluster). In particular, we show that a necessary and sufficient condition for a projected affinity matrix to be ideal reduces to a set of conditions on the input affinity that decompose along each cluster. Further, in the subspace clustering problem, where each cluster is defined by a linear subspace, we provide geometric conditions on the underlying subspaces which guarantee correct clustering via a continuous version of the problem. This allows us to explain theoretically the remarkable performance of a recently proposed doubly stochastic subspace clustering method.}
}

Endnote

%0 Conference Paper
%T Understanding Doubly Stochastic Clustering
%A Tianjiao Ding
%A Derek Lim
%A Rene Vidal
%A Benjamin D Haeffele
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato	
%F pmlr-v162-ding22a
%I PMLR
%P 5153--5165
%U https://proceedings.mlr.press/v162/ding22a.html
%V 162
%X The problem of projecting a matrix onto the space of doubly stochastic matrices finds several applications in machine learning. For example, in spectral clustering, it has been shown that forming the normalized Laplacian matrix from a data affinity matrix has close connections to projecting it onto the set of doubly stochastic matrices. However, the analysis of why this projection improves clustering has been limited. In this paper we present theoretical conditions on the given affinity matrix under which its doubly stochastic projection is an ideal affinity matrix (i.e., it has no false connections between clusters, and is well-connected within each cluster). In particular, we show that a necessary and sufficient condition for a projected affinity matrix to be ideal reduces to a set of conditions on the input affinity that decompose along each cluster. Further, in the subspace clustering problem, where each cluster is defined by a linear subspace, we provide geometric conditions on the underlying subspaces which guarantee correct clustering via a continuous version of the problem. This allows us to explain theoretically the remarkable performance of a recently proposed doubly stochastic subspace clustering method.

APA


Ding, T., Lim, D., Vidal, R. & Haeffele, B.D.. (2022). Understanding Doubly Stochastic Clustering. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:5153-5165 Available from https://proceedings.mlr.press/v162/ding22a.html.

Related Material

Download PDF