On Predictive Classification of Binary Vectors

Mats Gyllenberg, Timo Koski
Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, PMLR R1:239-242, 1997.

Abstract

The problem of rational classification of a database of binary vectors is analyzed by means of a family of Bayesian predictive distributions on the binary hypercube. The general notion of predictive classification was probably first discussed by S. Geisser. The predictive distributions are expressed in terms of a finite number of observables based on a given set of binary vectors (predictors or centroids) representing a system of classes and an entropy-maximizing family of probability distributions. We derive the (non-probabilistic) criterion of maximal predictive classification due to J. Gower (1974) as a special case of a Bayesian predictive classification. The notion of a predictive distribution will be related to stochastic complexity of a set of data with respect to a family of statistical distributions. An application to bacterial identification will be presented using a database of Enterobacteriaceae as in Gyllenberg (1996c). A framework for the analysis is provided by a theorem about the merging of opinions due to Blackwell and Dubins (1962). We prove certain results about the asymptotic properties of the predictive learning process.
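As a rough illustration of the two classification rules compared in the abstract, the following Python sketch contrasts a Beta-Bernoulli posterior predictive rule on the binary hypercube with a Gower-style maximal predictive rule that assigns a vector to the class predictor (centroid) matching the largest number of its attributes. The function names, Beta hyperparameters, and toy data are illustrative assumptions and are not taken from the paper.

import numpy as np

def beta_bernoulli_log_predictive(class_members, x, a=1.0, b=1.0):
    # Posterior predictive log-probability of binary vector x under one class,
    # with independent Beta(a, b) priors on each attribute probability
    # (a = b = 1 corresponds to Laplace smoothing). Hypothetical sketch.
    n = class_members.shape[0]
    ones = class_members.sum(axis=0)            # per-attribute counts of 1s in the class
    theta = (ones + a) / (n + a + b)            # predictive Bernoulli parameters
    return float(np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta)))

def classify_bayes(classes, x):
    # Assign x to the class with the largest posterior predictive probability.
    return int(np.argmax([beta_bernoulli_log_predictive(c, x) for c in classes]))

def classify_gower(centroids, x):
    # Gower-style maximal predictive rule: assign x to the centroid (class
    # predictor) that correctly predicts the largest number of its attributes.
    return int(np.argmax([(c == x).sum() for c in centroids]))

# Toy usage with two hypothetical classes of 5-dimensional binary profiles.
rng = np.random.default_rng(0)
class_a = rng.integers(0, 2, size=(20, 5))
class_b = rng.integers(0, 2, size=(20, 5))
x = np.array([1, 0, 1, 1, 0])
centroid_a = (class_a.mean(axis=0) >= 0.5).astype(int)   # majority-vote predictor
centroid_b = (class_b.mean(axis=0) >= 0.5).astype(int)
print(classify_bayes([class_a, class_b], x))
print(classify_gower([centroid_a, centroid_b], x))

Under these assumptions, the Gower rule is recovered from the Bayesian rule in the sense that maximizing the number of correctly predicted attributes corresponds to a special case of maximizing the predictive probability.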

Cite this Paper


BibTeX
@InProceedings{pmlr-vR1-gyllenberg97a,
  title = {On Predictive Classification of Binary Vectors},
  author = {Gyllenberg, Mats and Koski, Timo},
  booktitle = {Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics},
  pages = {239--242},
  year = {1997},
  editor = {Madigan, David and Smyth, Padhraic},
  volume = {R1},
  series = {Proceedings of Machine Learning Research},
  month = {04--07 Jan},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/r1/gyllenberg97a/gyllenberg97a.pdf},
  url = {https://proceedings.mlr.press/r1/gyllenberg97a.html},
  abstract = {The problem of rational classification of a database of binary vectors is analyzed by means of a family of Bayesian predictive distributions on the binary hypercube. The general notion of predictive classification was probably first discussed by S. Geisser. The predictive distributions are expressed in terms of a finite number of observables based on a given set of binary vectors (predictors or centroids) representing a system of classes and an entropy-maximizing family of probability distributions. We derive the (non-probabilistic) criterion of maximal predictive classification due to J. Gower (1974) as a special case of a Bayesian predictive classification. The notion of a predictive distribution will be related to stochastic complexity of a set of data with respect to a family of statistical distributions. An application to bacterial identification will be presented using a database of Enterobacteriaceae as in Gyllenberg (1996c). A framework for the analysis is provided by a theorem about the merging of opinions due to Blackwell and Dubins (1962). We prove certain results about the asymptotic properties of the predictive learning process.},
  note = {Reissued by PMLR on 30 March 2021.}
}
Endnote
%0 Conference Paper
%T On Predictive Classification of Binary Vectors
%A Mats Gyllenberg
%A Timo Koski
%B Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1997
%E David Madigan
%E Padhraic Smyth
%F pmlr-vR1-gyllenberg97a
%I PMLR
%P 239--242
%U https://proceedings.mlr.press/r1/gyllenberg97a.html
%V R1
%X The problem of rational classification of a database of binary vectors is analyzed by means of a family of Bayesian predictive distributions on the binary hypercube. The general notion of predictive classification was probably first discussed by S. Geisser. The predictive distributions are expressed in terms of a finite number of observables based on a given set of binary vectors (predictors or centroids) representing a system of classes and an entropy-maximizing family of probability distributions. We derive the (non-probabilistic) criterion of maximal predictive classification due to J. Gower (1974) as a special case of a Bayesian predictive classification. The notion of a predictive distribution will be related to stochastic complexity of a set of data with respect to a family of statistical distributions. An application to bacterial identification will be presented using a database of Enterobacteriaceae as in Gyllenberg (1996c). A framework for the analysis is provided by a theorem about the merging of opinions due to Blackwell and Dubins (1962). We prove certain results about the asymptotic properties of the predictive learning process.
%Z Reissued by PMLR on 30 March 2021.
APA
Gyllenberg, M. & Koski, T. (1997). On Predictive Classification of Binary Vectors. Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R1:239-242. Available from https://proceedings.mlr.press/r1/gyllenberg97a.html. Reissued by PMLR on 30 March 2021.