MDL Histogram Density Estimation

Petri Kontkanen, Petri Myllymäki
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:219-226, 2007.

Abstract

We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied to tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this framework can be applied to learning generic, irregular (variable-width bin) histograms, and how to compute the NML model selection criterion efficiently. We also derive a dynamic programming algorithm for finding both the MDL-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-kontkanen07a,
  title     = {MDL Histogram Density Estimation},
  author    = {Petri Kontkanen and Petri Myllymäki},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages     = {219--226},
  year      = {2007},
  editor    = {Marina Meila and Xiaotong Shen},
  volume    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {San Juan, Puerto Rico},
  month     = {21--24 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v2/kontkanen07a/kontkanen07a.pdf},
  url       = {http://proceedings.mlr.press/v2/kontkanen07a.html},
  abstract  = {We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied for tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this framework can be applied for learning generic, irregular (variable-width bin) histograms, and how to compute the NML model selection criterion efficiently. We also derive a dynamic programming algorithm for finding both the MDL-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.}
}
Endnote
%0 Conference Paper
%T MDL Histogram Density Estimation
%A Petri Kontkanen
%A Petri Myllymäki
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-kontkanen07a
%I PMLR
%J Proceedings of Machine Learning Research
%P 219--226
%U http://proceedings.mlr.press
%V 2
%W PMLR
%X We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied for tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this framework can be applied for learning generic, irregular (variable-width bin) histograms, and how to compute the NML model selection criterion efficiently. We also derive a dynamic programming algorithm for finding both the MDL-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.
RIS
TY  - CPAPER
TI  - MDL Histogram Density Estimation
AU  - Petri Kontkanen
AU  - Petri Myllymäki
BT  - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
PY  - 2007/03/11
DA  - 2007/03/11
ED  - Marina Meila
ED  - Xiaotong Shen
ID  - pmlr-v2-kontkanen07a
PB  - PMLR
SP  - 219
DP  - PMLR
EP  - 226
L1  - http://proceedings.mlr.press/v2/kontkanen07a/kontkanen07a.pdf
UR  - http://proceedings.mlr.press/v2/kontkanen07a.html
AB  - We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied for tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this framework can be applied for learning generic, irregular (variable-width bin) histograms, and how to compute the NML model selection criterion efficiently. We also derive a dynamic programming algorithm for finding both the MDL-optimal bin count and the cut point locations in polynomial time. Finally, we demonstrate our approach via simulation tests.
ER  -
APA
Kontkanen, P. & Myllymäki, P. (2007). MDL Histogram Density Estimation. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in PMLR 2:219-226.
