Safe Learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity


Peter Grünwald
Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:397-420, 2011.


We extend Bayesian MAP and Minimum Description Length (MDL) learning by testing whether the data can be substantially more compressed by a mixture of the MDL/MAP distribution with another element of the model, and adjusting the learning rate if this is the case. While standard Bayes and MDL can fail to converge if the model is wrong, the resulting “safe” estimator continues to achieve good rates with wrong models. Moreover, when applied to classification and regression models as considered in statistical learning theory, the approach achieves optimal rates under, e.g., Tsybakov’s conditions, and reveals new situations in which we can penalize by $(-\log \textsc{prior})/n$ rather than $\sqrt{(-\log \textsc{prior})/n}$.
