Information Theoretic Model Validation for Spectral Clustering

Morteza Haghir Chehreghani, Alberto Giovanni Busetto, Joachim M. Buhmann
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:495-503, 2012.

Abstract

Model validation constitutes a fundamental step in data clustering. The central question is: Which cluster model and how many clusters are most appropriate for a certain application? In this study, we introduce a method for the validation of spectral clustering based upon approximation set coding. In particular, we compare correlation and pairwise clustering to analyze the correlations of temporal gene expression profiles. To evaluate and select clustering models, we calculate their reliable informativeness. Experimental results in the context of gene expression analysis show that pairwise clustering yields superior amounts of reliable information. The analysis results are consistent with the Bayesian Information Criterion (BIC), and exhibit higher generality than BIC.

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-haghir12,
  title = {Information Theoretic Model Validation for Spectral Clustering},
  author = {Chehreghani, Morteza Haghir and Busetto, Alberto Giovanni and Buhmann, Joachim M.},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = {495--503},
  year = {2012},
  editor = {Lawrence, Neil D. and Girolami, Mark},
  volume = {22},
  series = {Proceedings of Machine Learning Research},
  address = {La Palma, Canary Islands},
  month = {21--23 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v22/haghir12/haghir12.pdf},
  url = {https://proceedings.mlr.press/v22/haghir12.html},
  abstract = {Model validation constitutes a fundamental step in data clustering. The central question is: Which cluster model and how many clusters are most appropriate for a certain application? In this study, we introduce a method for the validation of spectral clustering based upon approximation set coding. In particular, we compare correlation and pairwise clustering to analyze the correlations of temporal gene expression profiles. To evaluate and select clustering models, we calculate their reliable informativeness. Experimental results in the context of gene expression analysis show that pairwise clustering yields superior amounts of reliable information. The analysis results are consistent with the Bayesian Information Criterion (BIC), and exhibit higher generality than BIC.}
}
Endnote
%0 Conference Paper
%T Information Theoretic Model Validation for Spectral Clustering
%A Morteza Haghir Chehreghani
%A Alberto Giovanni Busetto
%A Joachim M. Buhmann
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-haghir12
%I PMLR
%P 495--503
%U https://proceedings.mlr.press/v22/haghir12.html
%V 22
%X Model validation constitutes a fundamental step in data clustering. The central question is: Which cluster model and how many clusters are most appropriate for a certain application? In this study, we introduce a method for the validation of spectral clustering based upon approximation set coding. In particular, we compare correlation and pairwise clustering to analyze the correlations of temporal gene expression profiles. To evaluate and select clustering models, we calculate their reliable informativeness. Experimental results in the context of gene expression analysis show that pairwise clustering yields superior amounts of reliable information. The analysis results are consistent with the Bayesian Information Criterion (BIC), and exhibit higher generality than BIC.
RIS
TY  - CPAPER
TI  - Information Theoretic Model Validation for Spectral Clustering
AU  - Morteza Haghir Chehreghani
AU  - Alberto Giovanni Busetto
AU  - Joachim M. Buhmann
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/04/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-haghir12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 495
EP  - 503
L1  - http://proceedings.mlr.press/v22/haghir12/haghir12.pdf
UR  - https://proceedings.mlr.press/v22/haghir12.html
AB  - Model validation constitutes a fundamental step in data clustering. The central question is: Which cluster model and how many clusters are most appropriate for a certain application? In this study, we introduce a method for the validation of spectral clustering based upon approximation set coding. In particular, we compare correlation and pairwise clustering to analyze the correlations of temporal gene expression profiles. To evaluate and select clustering models, we calculate their reliable informativeness. Experimental results in the context of gene expression analysis show that pairwise clustering yields superior amounts of reliable information. The analysis results are consistent with the Bayesian Information Criterion (BIC), and exhibit higher generality than BIC.
ER  -
APA
Chehreghani, M.H., Busetto, A.G. & Buhmann, J.M. (2012). Information Theoretic Model Validation for Spectral Clustering. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:495-503. Available from https://proceedings.mlr.press/v22/haghir12.html.
