Pattern Discovery via Entropy Minimization

Matthew Brand
Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, PMLR R2:10-17, 1999.

Abstract

We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy minimization, posterior maximization, and free energy minimization are all equivalent. Solutions for the maximum a posteriori (MAP) estimator yield powerful learning algorithms that combine all the charms of expectation-maximization and deterministic annealing. Contained as special cases are the methods of maximum entropy, maximum likelihood, and a new method, maximum structure. We focus on the maximum structure case, in which entropy minimization maximizes the amount of evidence supporting each parameter while minimizing uncertainty in the sufficient statistics and cross-entropy between the model and the data. In iterative estimation, the MAP estimator gradually extinguishes excess parameters, sculpting a model structure that reflects hidden structures in the data. These models are highly resistant to over-fitting and have the particular virtue of being easy to interpret, often yielding insights into the hidden causes that generate the data.
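To make the stated equivalence concrete, the following is a minimal sketch of the objective the abstract describes, assuming the entropic prior form P(theta) proportional to e^(-H(theta)) from Brand's closely related work (an assumption on our part; the paper itself gives the exact formulation):

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Sketch (assumed form): an entropic prior over parameters $\theta$,
% $P_e(\theta) \propto e^{-H(\theta)}$, makes MAP estimation equivalent
% to jointly maximizing the likelihood and minimizing parameter entropy.
\begin{align*}
P(\theta \mid X) &\propto P(X \mid \theta)\, e^{-H(\theta)},\\
\log P(\theta \mid X) &= \log P(X \mid \theta) - H(\theta) + \mathrm{const},\\
H(\theta) &= -\textstyle\sum_i \theta_i \log \theta_i .
\end{align*}
% Maximizing the posterior thus trades data fit against parameter
% entropy; parameters that earn no likelihood are driven toward zero
% and can be trimmed (the ``parameter extinction'' the abstract notes).
\end{document}

Under this reading, maximum likelihood, maximum entropy, and the new maximum-structure method correspond to different weightings of the likelihood and entropy terms in the log posterior above.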

Cite this Paper


BibTeX
@InProceedings{pmlr-vR2-brand99a,
  title     = {Pattern Discovery via Entropy Minimization},
  author    = {Brand, Matthew},
  booktitle = {Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics},
  pages     = {10--17},
  year      = {1999},
  editor    = {Heckerman, David and Whittaker, Joe},
  volume    = {R2},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--06 Jan},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/r2/brand99a/brand99a.pdf},
  url       = {https://proceedings.mlr.press/r2/brand99a.html},
  abstract  = {We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy minimization, posterior maximization, and free energy minimization are all equivalent. Solutions for the maximum a posteriori (MAP) estimator yield powerful learning algorithms that combine all the charms of expectation-maximization and deterministic annealing. Contained as special cases are the methods of maximum entropy, maximum likelihood, and a new method, maximum structure. We focus on the maximum structure case, in which entropy minimization maximizes the amount of evidence supporting each parameter while minimizing uncertainty in the sufficient statistics and cross-entropy between the model and the data. In iterative estimation, the MAP estimator gradually extinguishes excess parameters, sculpting a model structure that reflects hidden structures in the data. These models are highly resistant to over-fitting and have the particular virtue of being easy to interpret, often yielding insights into the hidden causes that generate the data.},
  note      = {Reissued by PMLR on 20 August 2020.}
}
Endnote
%0 Conference Paper
%T Pattern Discovery via Entropy Minimization
%A Matthew Brand
%B Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1999
%E David Heckerman
%E Joe Whittaker
%F pmlr-vR2-brand99a
%I PMLR
%P 10--17
%U https://proceedings.mlr.press/r2/brand99a.html
%V R2
%X We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy minimization, posterior maximization, and free energy minimization are all equivalent. Solutions for the maximum a posteriori (MAP) estimator yield powerful learning algorithms that combine all the charms of expectation-maximization and deterministic annealing. Contained as special cases are the methods of maximum entropy, maximum likelihood, and a new method, maximum structure. We focus on the maximum structure case, in which entropy minimization maximizes the amount of evidence supporting each parameter while minimizing uncertainty in the sufficient statistics and cross-entropy between the model and the data. In iterative estimation, the MAP estimator gradually extinguishes excess parameters, sculpting a model structure that reflects hidden structures in the data. These models are highly resistant to over-fitting and have the particular virtue of being easy to interpret, often yielding insights into the hidden causes that generate the data.
%Z Reissued by PMLR on 20 August 2020.
APA
Brand, M. (1999). Pattern Discovery via Entropy Minimization. Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R2:10-17. Available from https://proceedings.mlr.press/r2/brand99a.html. Reissued by PMLR on 20 August 2020.