Minimax M-estimation under Adversarial Contamination

Sujay Bhatt; Guanhua Fang; Ping Li; Gennady Samorodnitsky

Minimax M-estimation under Adversarial Contamination

Sujay Bhatt, Guanhua Fang, Ping Li, Gennady Samorodnitsky

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:1906-1924, 2022.

Abstract

We present a new finite-sample analysis of Catoni’s M-estimator under adversarial contamination, where an adversary is allowed to corrupt a fraction of the samples arbitrarily. We make minimal assumptions on the distribution of the uncontaminated random variables, namely, we only assume the existence of a known upper bound

$\upsilon_{\varepsilon} > 0$ on the

$(1+\varepsilon)^{th}$ central moment of the random variables, namely, for

$\varepsilon \in (0,1]$

$\mathbb{E}_{X_1 \sim \mathcal{D}} \Big| X_1 - \mu \Big|^{1+\varepsilon} \leq \upsilon_{\varepsilon}.$ {We} provide a lower bound on the minimax error rate for the mean estimation problem under adversarial corruption under this weak assumption, and establish that the proposed M-estimator achieves this lower bound (up to multiplicative constants). When the variance is infinite, the tolerance to contamination of any estimator reduces as

$\varepsilon \downarrow 0$ . We establish a tight upper bound that characterizes this bargain. To illustrate the usefulness of the derived robust M-estimator in an online setting, we present a bandit algorithm for the partially identifiable best arm identification problem that improves upon the sample complexity of the state of the art algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-bhatt22a,
  title = 	 {Minimax M-estimation under Adversarial Contamination},
  author =       {Bhatt, Sujay and Fang, Guanhua and Li, Ping and Samorodnitsky, Gennady},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {1906--1924},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/bhatt22a/bhatt22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/bhatt22a.html},
  abstract = 	 {We present a new finite-sample analysis of Catoni’s M-estimator under adversarial contamination, where an adversary is allowed to corrupt a fraction of the samples arbitrarily. We make minimal assumptions on the distribution of the uncontaminated random variables, namely, we only assume the existence of a known upper bound $\upsilon_{\varepsilon} > 0$ on the $(1+\varepsilon)^{th}$ central moment of the random variables, namely, for $\varepsilon \in (0,1]$ \[ \mathbb{E}_{X_1 \sim \mathcal{D}} \Big| X_1 - \mu \Big|^{1+\varepsilon} \leq \upsilon_{\varepsilon}. \]{We} provide a lower bound on the minimax error rate for the mean estimation problem under adversarial corruption under this weak assumption, and establish that the proposed M-estimator achieves this lower bound (up to multiplicative constants). When the variance is infinite, the tolerance to contamination of any estimator reduces as $\varepsilon \downarrow 0$. We establish a tight upper bound that characterizes this bargain. To illustrate the usefulness of the derived robust M-estimator in an online setting, we present a bandit algorithm for the partially identifiable best arm identification problem that improves upon the sample complexity of the state of the art algorithms.}
}

Endnote

%0 Conference Paper
%T Minimax M-estimation under Adversarial Contamination
%A Sujay Bhatt
%A Guanhua Fang
%A Ping Li
%A Gennady Samorodnitsky
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato	
%F pmlr-v162-bhatt22a
%I PMLR
%P 1906--1924
%U https://proceedings.mlr.press/v162/bhatt22a.html
%V 162
%X We present a new finite-sample analysis of Catoni’s M-estimator under adversarial contamination, where an adversary is allowed to corrupt a fraction of the samples arbitrarily. We make minimal assumptions on the distribution of the uncontaminated random variables, namely, we only assume the existence of a known upper bound $\upsilon_{\varepsilon} > 0$ on the $(1+\varepsilon)^{th}$ central moment of the random variables, namely, for $\varepsilon \in (0,1]$ \[ \mathbb{E}_{X_1 \sim \mathcal{D}} \Big| X_1 - \mu \Big|^{1+\varepsilon} \leq \upsilon_{\varepsilon}. \]{We} provide a lower bound on the minimax error rate for the mean estimation problem under adversarial corruption under this weak assumption, and establish that the proposed M-estimator achieves this lower bound (up to multiplicative constants). When the variance is infinite, the tolerance to contamination of any estimator reduces as $\varepsilon \downarrow 0$. We establish a tight upper bound that characterizes this bargain. To illustrate the usefulness of the derived robust M-estimator in an online setting, we present a bandit algorithm for the partially identifiable best arm identification problem that improves upon the sample complexity of the state of the art algorithms.

APA


Bhatt, S., Fang, G., Li, P. & Samorodnitsky, G.. (2022). Minimax M-estimation under Adversarial Contamination. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:1906-1924 Available from https://proceedings.mlr.press/v162/bhatt22a.html.

Related Material

Download PDF