Document Retrieval and Clustering: from Principal Component Analysis to Self-aggregation Networks

Chris Ding

Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, PMLR R4:85-92, 2003.

Abstract

We first extend Hopfield networks to clustering bipartite graphs (word-to-document associations) and show that the solution is given by principal component analysis (PCA). We then generalize this via the min-max clustering principle into self-aggregation networks, which are composed of scaled PCA components via the Hebb rule. Clustering amounts to an updating process in which connections between different clusters are automatically suppressed while connections within the same cluster are enhanced. This framework combines dimension reduction with clustering via neural networks and PCA. Self-aggregation networks can also improve information retrieval performance. Applications are presented.

Cite this Paper

BibTeX


@InProceedings{pmlr-vR4-ding03a,
  title = 	 {Document Retrieval and Clustering: from Principal Component Analysis to Self-aggregation Networks},
  author =       {Ding, Chris},
  booktitle = 	 {Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics},
  pages = 	 {85--92},
  year = 	 {2003},
  editor = 	 {Bishop, Christopher M. and Frey, Brendan J.},
  volume = 	 {R4},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {03--06 Jan},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/r4/ding03a/ding03a.pdf},
  url = 	 {https://proceedings.mlr.press/r4/ding03a.html},
  abstract = 	 {We first extend Hopfield networks to clustering bipartite graphs (word-to-document associations) and show that the solution is given by principal component analysis (PCA). We then generalize this via the min-max clustering principle into self-aggregation networks, which are composed of scaled PCA components via the Hebb rule. Clustering amounts to an updating process in which connections between different clusters are automatically suppressed while connections within the same cluster are enhanced. This framework combines dimension reduction with clustering via neural networks and PCA. Self-aggregation networks can also improve information retrieval performance. Applications are presented.},
  note =         {Reissued by PMLR on 01 April 2021.}
}

Endnote

%0 Conference Paper
%T Document Retrieval and Clustering: from Principal Component Analysis to Self-aggregation Networks
%A Chris Ding
%B Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2003
%E Christopher M. Bishop
%E Brendan J. Frey
%F pmlr-vR4-ding03a
%I PMLR
%P 85--92
%U https://proceedings.mlr.press/r4/ding03a.html
%V R4
%X We first extend Hopfield networks to clustering bipartite graphs (word-to-document associations) and show that the solution is given by principal component analysis (PCA). We then generalize this via the min-max clustering principle into self-aggregation networks, which are composed of scaled PCA components via the Hebb rule. Clustering amounts to an updating process in which connections between different clusters are automatically suppressed while connections within the same cluster are enhanced. This framework combines dimension reduction with clustering via neural networks and PCA. Self-aggregation networks can also improve information retrieval performance. Applications are presented.
%Z Reissued by PMLR on 01 April 2021.

APA


Ding, C. (2003). Document Retrieval and Clustering: from Principal Component Analysis to Self-aggregation Networks. Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R4:85-92. Available from https://proceedings.mlr.press/r4/ding03a.html. Reissued by PMLR on 01 April 2021.
