Document Retrieval and Clustering: from Principal Component Analysis to Self-aggregation Networks
Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, PMLR R4:85-92, 2003.
Abstract
We first extend Hopfield networks to clustering bipartite graphs (word-to-document associations) and show that the solution is given by principal component analysis (PCA). We then generalize this via the min-max clustering principle to self-aggregation networks, which are composed of scaled PCA components via the Hebb rule. Clustering amounts to an updating process in which connections between different clusters are automatically suppressed while connections within the same cluster are enhanced. This framework combines dimension reduction with clustering via neural networks and PCA. Self-aggregation networks can also improve information retrieval performance. Applications are presented.
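
The abstract gives no formulas, so the following is only an illustrative sketch of the idea, not the authors' algorithm. It assumes that "scaled PCA" means a singular value decomposition of the word-document matrix after dividing each entry by the square root of its row and column sums, and that the "Hebb rule" means summing outer products of the leading component pairs. The function names `scaled_components` and `hebbian_connections` and the toy matrix `B` are hypothetical.

```python
import numpy as np

def scaled_components(B, k):
    """Top-k singular triplets of the degree-scaled association matrix.

    Assumption: scaling divides B[i, j] by sqrt(row_sum[i] * col_sum[j]).
    Rows are words, columns are documents; no row or column may be all zero.
    """
    B = np.asarray(B, dtype=float)
    row_deg = B.sum(axis=1)
    col_deg = B.sum(axis=0)
    S = B / np.sqrt(np.outer(row_deg, col_deg))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def hebbian_connections(U_k, Vt_k):
    """Sum of outer products u_k v_k^T over the retained components.

    Assumption: this stands in for the Hebb-rule construction of the
    word-to-document connection weights.
    """
    return U_k @ Vt_k

# Toy association matrix: words 0-1 mostly occur in documents 0-1,
# words 2-3 mostly in documents 2-3, with weak cross-cluster links (0.5).
B = np.array([
    [2.0, 3.0, 0.5, 0.5],
    [3.0, 2.0, 0.5, 0.5],
    [0.5, 0.5, 2.0, 3.0],
    [0.5, 0.5, 3.0, 2.0],
])

U_k, s_k, Vt_k = scaled_components(B, k=2)
C = hebbian_connections(U_k, Vt_k)
print(np.round(s_k, 4))
print(np.round(C, 4))
```

In this symmetric toy case the rebuilt connection matrix `C` has within-cluster entries near 0.5 and cross-cluster entries near 0, which mirrors the suppression and enhancement described above. Real term-document data would not separate this cleanly, and the updating process in the paper may differ from this single step.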