Consistent optimization of AMS by logistic loss minimization

Wojciech Kotłowski
Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, PMLR 42:99-108, 2015.

Abstract

In this paper, we theoretically justify an approach, popular among participants of the Higgs Boson Machine Learning Challenge, to optimizing the approximate median significance (AMS). The approach is based on the following two-stage procedure. First, a real-valued function f is learned by minimizing a surrogate loss for binary classification, such as the logistic loss, on the training sample. Then, given f, a threshold θ̂ is tuned on a separate validation sample by direct optimization of AMS. We show that the regret of the resulting classifier (obtained by thresholding f at θ̂), measured with respect to the squared AMS, is upper-bounded by the regret of f measured with respect to the logistic loss. Hence, we prove that minimizing the logistic surrogate is a consistent method of optimizing AMS.
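The two-stage procedure from the abstract can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the AMS definition used in the Higgs Boson Machine Learning Challenge (with a regularization term b_reg = 10), fits the logistic model by plain gradient descent on synthetic data, and uses illustrative function names throughout.

```python
import numpy as np

def ams(s, b, b_reg=10.0):
    # Approximate median significance, as defined in the Higgs Boson
    # Machine Learning Challenge; b_reg regularizes small backgrounds.
    return np.sqrt(2.0 * ((s + b + b_reg) * np.log(1.0 + s / (b + b_reg)) - s))

def fit_logistic(X, y, lr=0.1, n_iter=500):
    # Stage 1: learn a real-valued scorer f(x) = w.x by minimizing
    # the logistic loss with gradient descent.
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def tune_threshold(scores, y, weights):
    # Stage 2: on a held-out validation sample, pick the threshold
    # theta on f that directly maximizes AMS.
    best_theta, best_ams = None, -np.inf
    for theta in np.unique(scores):
        sel = scores >= theta
        s = weights[sel & (y == 1)].sum()  # weighted selected signal
        b = weights[sel & (y == 0)].sum()  # weighted selected background
        a = ams(s, b)
        if a > best_ams:
            best_theta, best_ams = theta, a
    return best_theta, best_ams

# Synthetic demo: half the data for training f, half for tuning theta.
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, n)
X = rng.normal(loc=y[:, None] * 1.5, scale=1.0, size=(n, 2))
w = fit_logistic(X[:1000], y[:1000])
theta, best = tune_threshold(X[1000:] @ w, y[1000:], np.ones(1000))
```

The consistency result says that driving down the logistic regret of f in stage 1 also drives down the (squared) AMS regret of the thresholded classifier produced in stage 2.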

Cite this Paper


BibTeX
@InProceedings{pmlr-v42-kotl14,
  title     = {Consistent optimization of AMS by logistic loss minimization},
  author    = {Kotłowski, Wojciech},
  booktitle = {Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning},
  pages     = {99--108},
  year      = {2015},
  editor    = {Cowan, Glen and Germain, Cécile and Guyon, Isabelle and Kégl, Balázs and Rousseau, David},
  volume    = {42},
  series    = {Proceedings of Machine Learning Research},
  address   = {Montreal, Canada},
  month     = {13 Dec},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v42/kotl14.pdf},
  url       = {https://proceedings.mlr.press/v42/kotl14.html},
  abstract  = {In this paper, we theoretically justify an approach, popular among participants of the Higgs Boson Machine Learning Challenge, to optimizing the approximate median significance (AMS). The approach is based on the following two-stage procedure. First, a real-valued function f is learned by minimizing a surrogate loss for binary classification, such as the logistic loss, on the training sample. Then, given f, a threshold θ̂ is tuned on a separate validation sample by direct optimization of AMS. We show that the regret of the resulting classifier (obtained by thresholding f at θ̂), measured with respect to the squared AMS, is upper-bounded by the regret of f measured with respect to the logistic loss. Hence, we prove that minimizing the logistic surrogate is a consistent method of optimizing AMS.}
}
