Safe Learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity


Peter Grünwald;
Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:397-420, 2011.

Abstract

We extend Bayesian MAP and Minimum Description Length (MDL) learning by testing whether the data can be substantially more compressed by a mixture of the MDL/MAP distribution with another element of the model, and adjusting the learning rate if this is the case. While standard Bayes and MDL can fail to converge if the model is wrong, the resulting “safe” estimator continues to achieve good rates with wrong models. Moreover, when applied to classification and regression models as considered in statistical learning theory, the approach achieves optimal rates under, e.g., Tsybakov’s conditions, and reveals new situations in which we can penalize by $(-\log \textsc{prior})/n$ rather than $\sqrt{(-\log \textsc{prior})/n}$.
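To make the core idea concrete, below is a minimal sketch of the compression test described in the abstract: compare the codelength the data receives under the MDL/MAP estimate with the codelength under a mixture of that estimate and another element of the model, and lower the learning rate when the mixture compresses substantially better. This is only an illustration under assumptions not taken from the paper: the equal-weight mixture, the `margin` threshold, the halving schedule of learning rates, and the user-supplied routine `fit_at_rate` are all hypothetical choices, not the estimator defined in the paper.

```python
import numpy as np


def total_codelength(nll):
    """Codelength of the data in nats: the sum of per-sample -log p(x_i)."""
    return float(np.sum(nll))


def mixture_compresses_better(nll_map, nll_alt, margin=np.log(2)):
    """Test whether an equal-weight mixture of the MDL/MAP distribution with
    another element of the model compresses the data by more than `margin`
    nats compared to the MDL/MAP distribution alone.

    nll_map, nll_alt: per-sample negative log-likelihoods (in nats) under the
    MDL/MAP estimate and under the alternative distribution, respectively.
    (Equal mixture weights and the margin are illustrative assumptions.)
    """
    nll_map = np.asarray(nll_map, dtype=float)
    nll_alt = np.asarray(nll_alt, dtype=float)
    # Per-sample codelength under the mixture 0.5 * p_map + 0.5 * p_alt,
    # computed stably in log-space.
    nll_mix = -np.logaddexp(-nll_map + np.log(0.5), -nll_alt + np.log(0.5))
    gain = total_codelength(nll_map) - total_codelength(nll_mix)
    return gain > margin


def safe_learning_rate(fit_at_rate, data, etas=(1.0, 0.5, 0.25, 0.125)):
    """Halve the learning rate until the refitted estimate passes the
    compression test above.

    fit_at_rate(eta, data) is a hypothetical user-supplied routine that
    refits the model with generalized-Bayes learning rate eta and returns
    (nll_map, nll_alt): per-sample codelengths under the resulting MDL/MAP
    estimate and under a competing element of the model.
    """
    for eta in etas:
        nll_map, nll_alt = fit_at_rate(eta, data)
        if not mixture_compresses_better(nll_map, nll_alt):
            # No substantial extra compression by the mixture: eta is "safe".
            return eta
    return etas[-1]
```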
