On Predictive Classification of Binary Vectors

Mats Gyllenberg; Timo Koski

Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, PMLR R1:239-242, 1997.

Abstract

The problem of rational classification of a database of binary vectors is analyzed by means of a family of Bayesian predictive distributions on the binary hypercube. The general notion of predictive classification was probably first discussed by S. Geisser. The predictive distributions are expressed in terms of a finite number of observables based on a given set of binary vectors (predictors or centroids) representing a system of classes and an entropy-maximizing family of probability distributions. We derive the (non-probabilistic) criterion of maximal predictive classification due to J. Gower (1974) as a special case of a Bayesian predictive classification. The notion of a predictive distribution will be related to stochastic complexity of a set of data with respect to a family of statistical distributions. An application to bacterial identification will be presented using a database of Enterobacteriaceae as in Gyllenberg (1996c). A framework for the analysis is provided by a theorem about the merging of opinions due to Blackwell and Dubins (1962). We prove certain results about the asymptotic properties of the predictive learning process.

Cite this Paper

BibTeX


@InProceedings{pmlr-vR1-gyllenberg97a,
  title = 	 {On Predictive Classification of Binary Vectors},
  author =       {Gyllenberg, Mats and Koski, Timo},
  booktitle = 	 {Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics},
  pages = 	 {239--242},
  year = 	 {1997},
  editor = 	 {Madigan, David and Smyth, Padhraic},
  volume = 	 {R1},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {04--07 Jan},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/r1/gyllenberg97a/gyllenberg97a.pdf},
  url = 	 {https://proceedings.mlr.press/r1/gyllenberg97a.html},
  abstract = 	 {The problem of rational classification of a database of binary vectors is analyzed by means of a family of Bayesian predictive distributions on the binary hypercube. The general notion of predictive classification was probably first discussed by S. Geisser. The predictive distributions are expressed in terms of a finite number of observables based on a given set of binary vectors (predictors or centroids) representing a system of classes and an entropy-maximizing family of probability distributions. We derive the (non-probabilistic) criterion of maximal predictive classification due to J. Gower (1974) as a special case of a Bayesian predictive classification. The notion of a predictive distribution will be related to stochastic complexity of a set of data with respect to a family of statistical distributions. An application to bacterial identification will be presented using a database of Enterobacteriaceae as in Gyllenberg (1996c). A framework for the analysis is provided by a theorem about the merging of opinions due to Blackwell and Dubins (1962). We prove certain results about the asymptotic properties of the predictive learning process.},
  note =         {Reissued by PMLR on 30 March 2021.}
}

Endnote

%0 Conference Paper
%T On Predictive Classification of Binary Vectors
%A Mats Gyllenberg
%A Timo Koski
%B Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1997
%E David Madigan
%E Padhraic Smyth
%F pmlr-vR1-gyllenberg97a
%I PMLR
%P 239--242
%U https://proceedings.mlr.press/r1/gyllenberg97a.html
%V R1
%X The problem of rational classification of a database of binary vectors is analyzed by means of a family of Bayesian predictive distributions on the binary hypercube. The general notion of predictive classification was probably first discussed by S. Geisser. The predictive distributions are expressed in terms of a finite number of observables based on a given set of binary vectors (predictors or centroids) representing a system of classes and an entropy-maximizing family of probability distributions. We derive the (non-probabilistic) criterion of maximal predictive classification due to J. Gower (1974) as a special case of a Bayesian predictive classification. The notion of a predictive distribution will be related to stochastic complexity of a set of data with respect to a family of statistical distributions. An application to bacterial identification will be presented using a database of Enterobacteriaceae as in Gyllenberg (1996c). A framework for the analysis is provided by a theorem about the merging of opinions due to Blackwell and Dubins (1962). We prove certain results about the asymptotic properties of the predictive learning process.
%Z Reissued by PMLR on 30 March 2021.

APA


Gyllenberg, M. & Koski, T. (1997). On Predictive Classification of Binary Vectors. Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R1:239-242. Available from https://proceedings.mlr.press/r1/gyllenberg97a.html. Reissued by PMLR on 30 March 2021.
