Predictive Discretization during Model Selection

Harald Steck, Tommi S. Jaakkola
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:532-539, 2007.

Abstract

We present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which inherently ensures the optimal trade-off between goodness of fit and model complexity (including the number of discretization levels). Using the so-called finest grid implied by the data, our scoring function depends only on the number of data points in the various discretization levels. Not only can it be computed efficiently, but it is also invariant under monotonic transformations of the continuous space. Our experiments show that the discretization method can substantially impact the resulting graph structure.

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-steck07a,
  title = {Predictive Discretization during Model Selection},
  author = {Steck, Harald and Jaakkola, Tommi S.},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages = {532--539},
  year = {2007},
  editor = {Meila, Marina and Shen, Xiaotong},
  volume = {2},
  series = {Proceedings of Machine Learning Research},
  address = {San Juan, Puerto Rico},
  month = {21--24 Mar},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v2/steck07a/steck07a.pdf},
  url = {https://proceedings.mlr.press/v2/steck07a.html},
  abstract = {We present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which inherently ensures the optimal trade-off between goodness of fit and model complexity (including the number of discretization levels). Using the so-called finest grid implied by the data, our scoring function depends only on the number of data points in the various discretization levels. Not only can it be computed efficiently, but it is also invariant under monotonic transformations of the continuous space. Our experiments show that the discretization method can substantially impact the resulting graph structure.}
}
Endnote
%0 Conference Paper
%T Predictive Discretization during Model Selection
%A Harald Steck
%A Tommi S. Jaakkola
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-steck07a
%I PMLR
%P 532--539
%U https://proceedings.mlr.press/v2/steck07a.html
%V 2
%X We present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which inherently ensures the optimal trade-off between goodness of fit and model complexity (including the number of discretization levels). Using the so-called finest grid implied by the data, our scoring function depends only on the number of data points in the various discretization levels. Not only can it be computed efficiently, but it is also invariant under monotonic transformations of the continuous space. Our experiments show that the discretization method can substantially impact the resulting graph structure.
RIS
TY  - CPAPER
TI  - Predictive Discretization during Model Selection
AU  - Harald Steck
AU  - Tommi S. Jaakkola
BT  - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
DA  - 2007/03/11
ED  - Marina Meila
ED  - Xiaotong Shen
ID  - pmlr-v2-steck07a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 2
SP  - 532
EP  - 539
L1  - http://proceedings.mlr.press/v2/steck07a/steck07a.pdf
UR  - https://proceedings.mlr.press/v2/steck07a.html
AB  - We present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which inherently ensures the optimal trade-off between goodness of fit and model complexity (including the number of discretization levels). Using the so-called finest grid implied by the data, our scoring function depends only on the number of data points in the various discretization levels. Not only can it be computed efficiently, but it is also invariant under monotonic transformations of the continuous space. Our experiments show that the discretization method can substantially impact the resulting graph structure.
ER  -
APA
Steck, H. & Jaakkola, T.S. (2007). Predictive Discretization during Model Selection. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 2:532-539. Available from https://proceedings.mlr.press/v2/steck07a.html.
