Boosting in the Presence of Massart Noise

Ilias Diakonikolas, Russell Impagliazzo, Daniel M. Kane, Rex Lei, Jessica Sorrell, Christos Tzamos
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:1585-1644, 2021.

Abstract

We study the problem of boosting the accuracy of a weak learner in the (distribution-independent) PAC model with Massart noise. In the Massart noise model, the label of each example $x$ is independently misclassified with probability $\eta(x) \leq \eta$, where $\eta<1/2$. The Massart model lies between the random classification noise model and the agnostic model. Our main positive result is the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to $\eta$. Prior to our work, no non-trivial booster was known in this setting. Moreover, we show that this error upper bound is best possible for polynomial-time black-box boosters, under standard cryptographic assumptions. Our upper and lower bounds characterize the complexity of boosting in the distribution-independent PAC model with Massart noise. As a simple application of our positive result, we give the first efficient Massart learner for unions of high-dimensional rectangles.
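The Massart noise model described above can be made concrete with a small simulation: each example $x$ receives its own flip rate $\eta(x)$, bounded by a global $\eta < 1/2$, and its label is flipped independently with that probability. The sketch below is purely illustrative (the ground-truth halfspace, the particular $\eta(x)$ function, and all parameter values are assumptions, not from the paper — in the model $\eta(x)$ may be chosen adversarially).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative ground truth: a homogeneous halfspace in R^2.
w = np.array([1.0, -0.5])
X = rng.normal(size=(1000, 2))
y_clean = np.sign(X @ w)

# Massart noise: each example x has its own flip rate eta(x) <= eta_max < 1/2.
# In the model eta(x) is an arbitrary (even adversarial) function of x;
# a smooth sigmoid of the first coordinate is used here only for illustration.
eta_max = 0.4
eta_x = eta_max / (1.0 + np.exp(-X[:, 0]))  # values strictly below eta_max

# Flip each label independently with probability eta(x).
flips = rng.random(len(y_clean)) < eta_x
y_noisy = np.where(flips, -y_clean, y_clean)

print(f"fraction of labels flipped: {flips.mean():.3f}")
```

Note that the boosting guarantee in the paper is stated against $\eta$, the uniform upper bound on $\eta(x)$: the output hypothesis achieves misclassification error arbitrarily close to $\eta$, which the lower bound shows is the best a polynomial-time black-box booster can do.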

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-diakonikolas21d,
  title     = {Boosting in the Presence of Massart Noise},
  author    = {Diakonikolas, Ilias and Impagliazzo, Russell and Kane, Daniel M. and Lei, Rex and Sorrell, Jessica and Tzamos, Christos},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {1585--1644},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/diakonikolas21d/diakonikolas21d.pdf},
  url       = {https://proceedings.mlr.press/v134/diakonikolas21d.html},
  abstract  = {We study the problem of boosting the accuracy of a weak learner in the (distribution-independent) PAC model with Massart noise. In the Massart noise model, the label of each example $x$ is independently misclassified with probability $\eta(x) \leq \eta$, where $\eta<1/2$. The Massart model lies between the random classification noise model and the agnostic model. Our main positive result is the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to $\eta$. Prior to our work, no non-trivial booster was known in this setting. Moreover, we show that this error upper bound is best possible for polynomial-time black-box boosters, under standard cryptographic assumptions. Our upper and lower bounds characterize the complexity of boosting in the distribution-independent PAC model with Massart noise. As a simple application of our positive result, we give the first efficient Massart learner for unions of high-dimensional rectangles.}
}
Endnote
%0 Conference Paper
%T Boosting in the Presence of Massart Noise
%A Ilias Diakonikolas
%A Russell Impagliazzo
%A Daniel M. Kane
%A Rex Lei
%A Jessica Sorrell
%A Christos Tzamos
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-diakonikolas21d
%I PMLR
%P 1585--1644
%U https://proceedings.mlr.press/v134/diakonikolas21d.html
%V 134
%X We study the problem of boosting the accuracy of a weak learner in the (distribution-independent) PAC model with Massart noise. In the Massart noise model, the label of each example $x$ is independently misclassified with probability $\eta(x) \leq \eta$, where $\eta<1/2$. The Massart model lies between the random classification noise model and the agnostic model. Our main positive result is the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to $\eta$. Prior to our work, no non-trivial booster was known in this setting. Moreover, we show that this error upper bound is best possible for polynomial-time black-box boosters, under standard cryptographic assumptions. Our upper and lower bounds characterize the complexity of boosting in the distribution-independent PAC model with Massart noise. As a simple application of our positive result, we give the first efficient Massart learner for unions of high-dimensional rectangles.
APA
Diakonikolas, I., Impagliazzo, R., Kane, D.M., Lei, R., Sorrell, J. & Tzamos, C. (2021). Boosting in the Presence of Massart Noise. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:1585-1644. Available from https://proceedings.mlr.press/v134/diakonikolas21d.html.