Information Theoretical Clustering via Semidefinite Programming


Meihong Wang, Fei Sha ;
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:761-769, 2011.


We propose techniques of convex optimization for information theoretical clustering. The clustering objective is to maximize the mutual information between data points and cluster assignments. We formulate this problem first as an instance of MAX K CUT on weighted graphs. We then apply the technique of semidefinite programming (SDP) relaxation to obtain a convex SDP problem. We show how the solution of the SDP problem can be further improved with a low-rank refinement heuristic. The low-rank solution reveals more clearly the cluster structure of the data. Empirical studies on several datasets demonstrate the effectiveness of our approach. In particular, the approach outperforms several other clustering algorithms when compared on standard evaluation metrics. [pdf]

Related Material