Hierarchical Convex NMF for Clustering Massive Data

Kristian Kersting; Mirwaes Wahabzada; Christian Thurau; Christian Bauckhage

Proceedings of 2nd Asian Conference on Machine Learning, PMLR 13:253-268, 2010.

Abstract

We present an extension of convex-hull non-negative matrix factorization (CH-NMF) which was recently proposed as a large-scale variant of convex non-negative matrix factorization or Archetypal Analysis. CH-NMF factorizes a non-negative data matrix $V$ into two non-negative matrix factors $V \approx WH$ such that the columns of $W$ are convex combinations of certain data points so that they are readily interpretable to data analysts. There is, however, no free lunch: imposing convexity constraints on $W$ typically prevents adaptation to intrinsic, low-dimensional structures in the data. Alas, in cases where the data is distributed in a non-convex manner or consists of mixtures of lower-dimensional convex distributions, the cluster representatives obtained from CH-NMF will be less meaningful. In this paper, we present a hierarchical CH-NMF that automatically adapts to internal structures of a dataset, hence it yields meaningful and interpretable clusters for non-convex datasets. This is also confirmed by our extensive evaluation on DBLP publication records of $760,000$ authors, $4,000,000$ images harvested from the web, and $150,000,000$ votes on World of Warcraft guilds.
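
The convexity constraint described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' CH-NMF algorithm: the weight matrix `G`, the toy sizes, and the random data are hypothetical, and the coefficients `H` are fitted with the standard Lee-Seung multiplicative update for a fixed `W`, which is swapped in here only to make the example complete. It shows that when each column of `W` is a convex combination of data points, `W` stays non-negative and inside the range spanned by the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions for illustration only).
n_features, n_points, k = 2, 6, 3

# Non-negative data matrix V; each column is one data point.
V = rng.random((n_features, n_points))

# Hypothetical weight matrix G: non-negative, every column sums to 1,
# so each column of W = V @ G is a convex combination of data points.
G = rng.random((n_points, k))
G /= G.sum(axis=0, keepdims=True)
W = V @ G

# Non-negative coefficients H, fitted for fixed W with the Lee-Seung
# multiplicative update (a generic NMF step, not the paper's method).
H = rng.random((k, n_points))
err_before = np.linalg.norm(V - W @ H)
eps = 1e-12  # guards against division by zero
for _ in range(50):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
err_after = np.linalg.norm(V - W @ H)

print("reconstruction error before:", err_before)
print("reconstruction error after: ", err_after)
```

Because the columns of `W` are weighted averages of actual data points, they can be read as "typical mixtures" of observed records, which is the interpretability argument the abstract makes. The same constraint also keeps `W` inside the convex hull of the data, which is why a single factorization struggles with non-convex data.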

Cite this Paper

BibTeX


@InProceedings{pmlr-v13-kersting10a,
  title = 	 {Hierarchical Convex NMF for Clustering Massive Data},
  author = 	 {Kersting, Kristian and Wahabzada, Mirwaes and Thurau, Christian and Bauckhage, Christian},
  booktitle = 	 {Proceedings of 2nd Asian Conference on Machine Learning},
  pages = 	 {253--268},
  year = 	 {2010},
  editor = 	 {Sugiyama, Masashi and Yang, Qiang},
  volume = 	 {13},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Tokyo, Japan},
  month = 	 {08--10 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v13/kersting10a/kersting10a.pdf},
  url = 	 {https://proceedings.mlr.press/v13/kersting10a.html},
  abstract = 	 {We present an extension of convex-hull non-negative matrix factorization (CH-NMF) which was recently proposed as a large-scale variant of convex non-negative matrix factorization or Archetypal Analysis. CH-NMF factorizes a non-negative data matrix $V$ into two non-negative matrix factors $V \approx WH$ such that the columns of $W$ are convex combinations of certain data points so that they are readily interpretable to data analysts. There is, however, no free lunch: imposing convexity constraints on $W$ typically prevents adaptation to intrinsic, low-dimensional structures in the data. Alas, in cases where the data is distributed in a non-convex manner or consists of mixtures of lower-dimensional convex distributions, the cluster representatives obtained from CH-NMF will be less meaningful. In this paper, we present a hierarchical CH-NMF that automatically adapts to internal structures of a dataset, hence it yields meaningful and interpretable clusters for non-convex datasets. This is also confirmed by our extensive evaluation on DBLP publication records of $760,000$ authors, $4,000,000$ images harvested from the web, and $150,000,000$ votes on World of Warcraft guilds.}
}

Endnote

%0 Conference Paper
%T Hierarchical Convex NMF for Clustering Massive Data
%A Kristian Kersting
%A Mirwaes Wahabzada
%A Christian Thurau
%A Christian Bauckhage
%B Proceedings of 2nd Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2010
%E Masashi Sugiyama
%E Qiang Yang
%F pmlr-v13-kersting10a
%I PMLR
%P 253--268
%U https://proceedings.mlr.press/v13/kersting10a.html
%V 13
%X We present an extension of convex-hull non-negative matrix factorization (CH-NMF) which was recently proposed as a large-scale variant of convex non-negative matrix factorization or Archetypal Analysis. CH-NMF factorizes a non-negative data matrix $V$ into two non-negative matrix factors $V \approx WH$ such that the columns of $W$ are convex combinations of certain data points so that they are readily interpretable to data analysts. There is, however, no free lunch: imposing convexity constraints on $W$ typically prevents adaptation to intrinsic, low-dimensional structures in the data. Alas, in cases where the data is distributed in a non-convex manner or consists of mixtures of lower-dimensional convex distributions, the cluster representatives obtained from CH-NMF will be less meaningful. In this paper, we present a hierarchical CH-NMF that automatically adapts to internal structures of a dataset, hence it yields meaningful and interpretable clusters for non-convex datasets. This is also confirmed by our extensive evaluation on DBLP publication records of $760,000$ authors, $4,000,000$ images harvested from the web, and $150,000,000$ votes on World of Warcraft guilds.

RIS


TY  - CPAPER
TI  - Hierarchical Convex NMF for Clustering Massive Data
AU  - Kristian Kersting
AU  - Mirwaes Wahabzada
AU  - Christian Thurau
AU  - Christian Bauckhage
BT  - Proceedings of 2nd Asian Conference on Machine Learning
DA  - 2010/10/31
ED  - Masashi Sugiyama
ED  - Qiang Yang
ID  - pmlr-v13-kersting10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 13
SP  - 253
EP  - 268
L1  - http://proceedings.mlr.press/v13/kersting10a/kersting10a.pdf
UR  - https://proceedings.mlr.press/v13/kersting10a.html
AB  - We present an extension of convex-hull non-negative matrix factorization (CH-NMF) which was recently proposed as a large-scale variant of convex non-negative matrix factorization or Archetypal Analysis. CH-NMF factorizes a non-negative data matrix $V$ into two non-negative matrix factors $V \approx WH$ such that the columns of $W$ are convex combinations of certain data points so that they are readily interpretable to data analysts. There is, however, no free lunch: imposing convexity constraints on $W$ typically prevents adaptation to intrinsic, low-dimensional structures in the data. Alas, in cases where the data is distributed in a non-convex manner or consists of mixtures of lower-dimensional convex distributions, the cluster representatives obtained from CH-NMF will be less meaningful. In this paper, we present a hierarchical CH-NMF that automatically adapts to internal structures of a dataset, hence it yields meaningful and interpretable clusters for non-convex datasets. This is also confirmed by our extensive evaluation on DBLP publication records of $760,000$ authors, $4,000,000$ images harvested from the web, and $150,000,000$ votes on World of Warcraft guilds.
ER  -

APA


Kersting, K., Wahabzada, M., Thurau, C. & Bauckhage, C. (2010). Hierarchical Convex NMF for Clustering Massive Data. Proceedings of 2nd Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 13:253-268. Available from https://proceedings.mlr.press/v13/kersting10a.html.
