Discriminant Analysis on Dissimilarity Data : a New Fast Gaussian like Algorithm

Anne Guérin-Dugué, Gilles Celeux
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, PMLR R3:117-122, 2001.

Abstract

Classifying objects according to their proximity is the fundamental task of pattern recognition and arises as a classification problem or discriminant analysis in experimental sciences. Here we consider a particular point of view on discriminant analysis from a dissimilarity data table. We develop a new approach, inspired by the Gaussian model in discriminant analysis, which defines a set of decision rules from simple statistics on the dissimilarity matrix between observations. This matrix may be only sparsely filled when dealing with huge databases. Numerical experiments on artificial and real data (protein classification) show interesting behaviour compared to a $K$NN classifier: (i) equivalent error rate, (ii) dramatically lower CPU times and (iii) more robustness with sparse dissimilarity structure up to $40\%$ of actual dissimilarity measures.

Cite this Paper


BibTeX
@InProceedings{pmlr-vR3-guerin-dugue01a,
  title = {Discriminant Analysis on Dissimilarity Data : a New Fast Gaussian like Algorithm},
  author = {Gu{\'{e}}rin{-}Dugu{\'{e}}, Anne and Celeux, Gilles},
  booktitle = {Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics},
  pages = {117--122},
  year = {2001},
  editor = {Richardson, Thomas S. and Jaakkola, Tommi S.},
  volume = {R3},
  series = {Proceedings of Machine Learning Research},
  month = {04--07 Jan},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/r3/guerin-dugue01a/guerin-dugue01a.pdf},
  url = {https://proceedings.mlr.press/r3/guerin-dugue01a.html},
  abstract = {Classifying objects according to their proximity is the fundamental task of pattern recognition and arises as a classification problem or discriminant analysis in experimental sciences. Here we consider a particular point of view on discriminant analysis from a dissimilarity data table. We develop a new approach, inspired by the Gaussian model in discriminant analysis, which defines a set of decision rules from simple statistics on the dissimilarity matrix between observations. This matrix may be only sparsely filled when dealing with huge databases. Numerical experiments on artificial and real data (protein classification) show interesting behaviour compared to a $K$NN classifier: (i) equivalent error rate, (ii) dramatically lower CPU times and (iii) more robustness with sparse dissimilarity structure up to $40\%$ of actual dissimilarity measures.},
  note = {Reissued by PMLR on 31 March 2021.}
}
Endnote
%0 Conference Paper
%T Discriminant Analysis on Dissimilarity Data : a New Fast Gaussian like Algorithm
%A Anne Guérin-Dugué
%A Gilles Celeux
%B Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2001
%E Thomas S. Richardson
%E Tommi S. Jaakkola
%F pmlr-vR3-guerin-dugue01a
%I PMLR
%P 117--122
%U https://proceedings.mlr.press/r3/guerin-dugue01a.html
%V R3
%X Classifying objects according to their proximity is the fundamental task of pattern recognition and arises as a classification problem or discriminant analysis in experimental sciences. Here we consider a particular point of view on discriminant analysis from a dissimilarity data table. We develop a new approach, inspired by the Gaussian model in discriminant analysis, which defines a set of decision rules from simple statistics on the dissimilarity matrix between observations. This matrix may be only sparsely filled when dealing with huge databases. Numerical experiments on artificial and real data (protein classification) show interesting behaviour compared to a $K$NN classifier: (i) equivalent error rate, (ii) dramatically lower CPU times and (iii) more robustness with sparse dissimilarity structure up to $40\%$ of actual dissimilarity measures.
%Z Reissued by PMLR on 31 March 2021.
APA
Guérin-Dugué, A. & Celeux, G. (2001). Discriminant Analysis on Dissimilarity Data : a New Fast Gaussian like Algorithm. Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R3:117-122. Available from https://proceedings.mlr.press/r3/guerin-dugue01a.html. Reissued by PMLR on 31 March 2021.
