Mixability is Bayes Risk Curvature Relative to Log Loss

Tim van Erven; Mark D. Reid; Robert C. Williamson

Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:233-252, 2011.

Abstract

Mixability of a loss governs the best possible performance when aggregating expert predictions with respect to that loss. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses where there are few existing results. We show that mixability is governed by the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses.
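As a rough numerical illustration of the binary-case statement, the sketch below assumes the mixability constant can be read as the infimum over p of the ratio between the second derivative of the Bayes risk for log loss and that of the loss in question. This reading, the helper names, the grid, and the finite-difference step are all assumptions made for illustration; none of them are taken from this page.

```python
import math

def bayes_risk_log(p):
    # Bayes risk of log loss for a binary outcome: entropy in nats.
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def bayes_risk_square(p):
    # Bayes risk of square loss (y - q)^2 with y in {0, 1}: p(1 - p).
    return p * (1 - p)

def second_derivative(f, p, h=1e-4):
    # Central finite-difference estimate of f''(p).
    return (f(p + h) - 2 * f(p) + f(p - h)) / (h * h)

def curvature_ratio_infimum(bayes_risk, grid, h=1e-4):
    # Assumed characterisation: smallest ratio of log-loss curvature
    # to the curvature of the given Bayes risk over the grid.
    return min(
        second_derivative(bayes_risk_log, p, h) / second_derivative(bayes_risk, p, h)
        for p in grid
    )

# Hypothetical grid on (0, 1), kept away from the endpoints.
grid = [0.01 + 0.001 * i for i in range(981)]

eta_square = curvature_ratio_infimum(bayes_risk_square, grid)
eta_log = curvature_ratio_infimum(bayes_risk_log, grid)
print(eta_square, eta_log)
```

Under these assumptions the ratio for square loss is 1/(2p(1-p)), which is smallest at p = 1/2, so the estimate comes out close to 2. Log loss compared against itself gives 1.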

Cite this Paper

BibTeX


@InProceedings{pmlr-v19-vanerven11a,
  title = {Mixability is Bayes Risk Curvature Relative to Log Loss},
  author = {van Erven, Tim and Reid, Mark D. and Williamson, Robert C.},
  booktitle = {Proceedings of the 24th Annual Conference on Learning Theory},
  pages = {233--252},
  year = {2011},
  editor = {Kakade, Sham M. and von Luxburg, Ulrike},
  volume = {19},
  series = {Proceedings of Machine Learning Research},
  address = {Budapest, Hungary},
  month = {09--11 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v19/vanerven11a/vanerven11a.pdf},
  url = {https://proceedings.mlr.press/v19/vanerven11a.html},
  abstract = {Mixability of a loss governs the best possible performance when aggregating expert predictions with respect to that loss. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses where there are few existing results. We show that mixability is governed by the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses.}
}

Endnote

%0 Conference Paper
%T Mixability is Bayes Risk Curvature Relative to Log Loss
%A Tim van Erven
%A Mark D. Reid
%A Robert C. Williamson
%B Proceedings of the 24th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2011
%E Sham M. Kakade
%E Ulrike von Luxburg
%F pmlr-v19-vanerven11a
%I PMLR
%P 233--252
%U https://proceedings.mlr.press/v19/vanerven11a.html
%V 19
%X Mixability of a loss governs the best possible performance when aggregating expert predictions with respect to that loss. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses where there are few existing results. We show that mixability is governed by the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses.

RIS


TY  - CPAPER
TI  - Mixability is Bayes Risk Curvature Relative to Log Loss
AU  - Tim van Erven
AU  - Mark D. Reid
AU  - Robert C. Williamson
BT  - Proceedings of the 24th Annual Conference on Learning Theory
DA  - 2011/12/21
ED  - Sham M. Kakade
ED  - Ulrike von Luxburg
ID  - pmlr-v19-vanerven11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 19
SP  - 233
EP  - 252
L1  - http://proceedings.mlr.press/v19/vanerven11a/vanerven11a.pdf
UR  - https://proceedings.mlr.press/v19/vanerven11a.html
AB  - Mixability of a loss governs the best possible performance when aggregating expert predictions with respect to that loss. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses where there are few existing results. We show that mixability is governed by the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses.
ER  -

APA


van Erven, T., Reid, M.D. & Williamson, R.C. (2011). Mixability is Bayes Risk Curvature Relative to Log Loss. Proceedings of the 24th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 19:233-252. Available from https://proceedings.mlr.press/v19/vanerven11a.html.
