Optimal Robust Learning of Discrete Distributions from Batches

Ayush Jain, Alon Orlitsky
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4651-4660, 2020.

Abstract

Many applications, including natural language processing, sensor networks, collaborative filtering, and federated learning, call for estimating discrete distributions from data collected in batches, some of which may be untrustworthy, erroneous, faulty, or even adversarial. Previous estimators for this setting ran in exponential time, and for some regimes required a suboptimal number of batches. We provide the first polynomial-time estimator that is optimal in the number of batches and achieves essentially the best possible estimation accuracy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-jain20a,
  title = {Optimal Robust Learning of Discrete Distributions from Batches},
  author = {Jain, Ayush and Orlitsky, Alon},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {4651--4660},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/jain20a/jain20a.pdf},
  url = {https://proceedings.mlr.press/v119/jain20a.html},
  abstract = {Many applications, including natural language processing, sensor networks, collaborative filtering, and federated learning, call for estimating discrete distributions from data collected in batches, some of which may be untrustworthy, erroneous, faulty, or even adversarial. Previous estimators for this setting ran in exponential time, and for some regimes required a suboptimal number of batches. We provide the first polynomial-time estimator that is optimal in the number of batches and achieves essentially the best possible estimation accuracy.}
}
Endnote
%0 Conference Paper
%T Optimal Robust Learning of Discrete Distributions from Batches
%A Ayush Jain
%A Alon Orlitsky
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-jain20a
%I PMLR
%P 4651--4660
%U https://proceedings.mlr.press/v119/jain20a.html
%V 119
%X Many applications, including natural language processing, sensor networks, collaborative filtering, and federated learning, call for estimating discrete distributions from data collected in batches, some of which may be untrustworthy, erroneous, faulty, or even adversarial. Previous estimators for this setting ran in exponential time, and for some regimes required a suboptimal number of batches. We provide the first polynomial-time estimator that is optimal in the number of batches and achieves essentially the best possible estimation accuracy.
APA
Jain, A. & Orlitsky, A. (2020). Optimal Robust Learning of Discrete Distributions from Batches. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:4651-4660. Available from https://proceedings.mlr.press/v119/jain20a.html.
