Learning in Hybrid Noise Environments Using Statistical Queries

Scott Evan Decatur
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:175-185, 1995.

Abstract

We consider theoretical models of learning from noisy data. Specifically, we focus on learning in the probability approximately correct model as defined by Valiant. Two of the most widely studied models of noise in this setting have been classification noise and malicious errors. However, a more realistic model combining the two types of noise has not been formalized. We define a learning environment based on a natural combination of these two noise models. We first show that hypothesis testing is possible in this model. We next describe a simple technique for learning in this model, and then describe a more powerful technique based on statistical query learning. We show that the noise tolerance of this improved technique is roughly optimal with respect to the tolerance of the statistical query algorithm and that it provides a smooth tradeoff between the tolerable amounts of the two types of noise. Finally, we show that statistical query simulation yields learning algorithms for other combinations of noise models, thus demonstrating that statistical query specification truly captures the generic fault tolerance of a learning algorithm.

Cite this Paper


BibTeX
@InProceedings{pmlr-vR0-decatur95a, title = {Learning in Hybrid Noise Environments Using Statistical Queries}, author = {Decatur, Scott Evan}, booktitle = {Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics}, pages = {175--185}, year = {1995}, editor = {Fisher, Doug and Lenz, Hans-Joachim}, volume = {R0}, series = {Proceedings of Machine Learning Research}, month = {04--07 Jan}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/r0/decatur95a/decatur95a.pdf}, url = {https://proceedings.mlr.press/r0/decatur95a.html}, abstract = {We consider theoretical models of learning from noisy data. Specifically, we focus on learning in the probability approximately correct model as defined by Valiant. Two of the most widely studied models of noise in this setting have been classification noise and malicious errors. However, a more realistic model combining the two types of noise has not been formalized. We define a learning environment based on a natural combination of these two noise models. We first show that hypothesis testing is possible in this model. We next describe a simple technique for learning in this model, and then describe a more powerful technique based on statistical query learning. We show that the noise tolerance of this improved technique is roughly optimal with respect to the tolerance of the statistical query algorithm and that it provides a smooth tradeoff between the tolerable amounts of the two types of noise. Finally, we show that statistical query simulation yields learning algorithms for other combinations of noise models, thus demonstrating that statistical query specification truly captures the generic fault tolerance of a learning algorithm.}, note = {Reissued by PMLR on 01 May 2022.} }
Endnote
%0 Conference Paper %T Learning in Hybrid Noise Environments Using Statistical Queries %A Scott Evan Decatur %B Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 1995 %E Doug Fisher %E Hans-Joachim Lenz %F pmlr-vR0-decatur95a %I PMLR %P 175--185 %U https://proceedings.mlr.press/r0/decatur95a.html %V R0 %X We consider theoretical models of learning from noisy data. Specifically, we focus on learning in the probability approximately correct model as defined by Valiant. Two of the most widely studied models of noise in this setting have been classification noise and malicious errors. However, a more realistic model combining the two types of noise has not been formalized. We define a learning environment based on a natural combination of these two noise models. We first show that hypothesis testing is possible in this model. We next describe a simple technique for learning in this model, and then describe a more powerful technique based on statistical query learning. We show that the noise tolerance of this improved technique is roughly optimal with respect to the tolerance of the statistical query algorithm and that it provides a smooth tradeoff between the tolerable amounts of the two types of noise. Finally, we show that statistical query simulation yields learning algorithms for other combinations of noise models, thus demonstrating that statistical query specification truly captures the generic fault tolerance of a learning algorithm. %Z Reissued by PMLR on 01 May 2022.
APA
Decatur, S.E.. (1995). Learning in Hybrid Noise Environments Using Statistical Queries. Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R0:175-185 Available from https://proceedings.mlr.press/r0/decatur95a.html. Reissued by PMLR on 01 May 2022.

Related Material