Discriminant Analysis on Dissimilarity Data : a New Fast Gaussian like Algorithm
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, PMLR R3:117-122, 2001.
Classifying objects according to their proximity is the fundamental task of pattern recognition and arises as a classification problem or discriminant analysis in experimental sciences. Here we consider discriminant analysis from the particular point of view of a dissimilarity data table. We develop a new approach, inspired by the Gaussian model in discriminant analysis, which defines a set of decision rules from simple statistics on the dissimilarity matrix between observations. When dealing with huge databases, this matrix may be only sparsely filled. Numerical experiments on artificial and real data (protein classification) show interesting behaviour compared to a $K$NN classifier: (i) equivalent error rate, (ii) dramatically lower CPU times and (iii) more robustness with a sparse dissimilarity structure, up to $40\%$ of actual dissimilarity measures.
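To make the idea of decision rules built from simple statistics on a dissimilarity matrix concrete, here is a minimal Python sketch. It is not the algorithm of the paper, whose rules are not given in this abstract. It assumes the dissimilarities behave like Euclidean distances, so that the squared distance from an object to a class centroid equals the mean squared dissimilarity to the class members minus half the mean squared within-class dissimilarity. The function names `fit_class_spread` and `predict`, and the use of NaN for missing entries, are illustrative choices.

```python
import numpy as np

def fit_class_spread(D_train, y):
    """Per class: half the mean squared within-class dissimilarity.

    D_train is an (n, n) dissimilarity matrix; NaN marks a missing entry
    (an assumed encoding for a sparse matrix).
    """
    classes = np.unique(y)
    spread = np.array([
        0.5 * np.nanmean(D_train[np.ix_(y == k, y == k)] ** 2)
        for k in classes
    ])
    return classes, spread

def predict(d_new, y, classes, spread):
    """Assign each new object to the class with the closest implied centroid.

    d_new is an (m, n) matrix of dissimilarities from m new objects to the
    n training objects; missing entries (NaN) are skipped in the averages.
    """
    to_class = np.stack(
        [np.nanmean(d_new[:, y == k] ** 2, axis=1) for k in classes],
        axis=1,
    )
    # Under the Euclidean assumption, to_class - spread is the squared
    # distance to each class centroid; pick the smallest.
    return classes[np.argmin(to_class - spread, axis=1)]
```

Unlike a $K$NN rule, this sketch needs no sorting of neighbours at prediction time, only one average per class, and the averages remain computable when part of the dissimilarities is missing.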