Robust Distribution Learning with Local and Global Adversarial Corruptions (extended abstract)

Sloan Nietert, Ziv Goldfeld, Soroosh Shafiee
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:4007-4008, 2024.

Abstract

We consider learning in an adversarial environment, where an $\varepsilon$-fraction of samples from a distribution $P$ are arbitrarily modified (\emph{global} corruptions) and the remaining perturbations have average magnitude bounded by $\rho$ (\emph{local} corruptions). Given access to $n$ such corrupted samples, we seek a computationally efficient estimator $\hat{P}_n$ that minimizes the Wasserstein distance $W_1(\hat{P}_n,P)$. In fact, we attack the fine-grained task of minimizing $W_1(\Pi_\sharp \hat{P}_n, \Pi_\sharp P)$ for all orthogonal projections $\Pi \in \mathbb{R}^{d \times d}$, with performance scaling with $\mathrm{rank}(\Pi) = k$. This allows us to account simultaneously for mean estimation ($k=1$), distribution estimation ($k=d$), as well as the settings interpolating between these two extremes. We characterize the optimal population-limit risk for this task and then develop an efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + \rho + \tilde{O}(k\sqrt{d}n^{-1/k})$ when $P$ has bounded covariance. Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator. We apply this algorithm to robust stochastic optimization, and, in the process, uncover a new method for overcoming the curse of dimensionality in Wasserstein distributionally robust optimization.
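To make the corruption model concrete, below is a minimal Python sketch (not from the paper) that simulates both corruption types and measures their effect under a rank-1 projection, where $W_1$ between the projected measures reduces to a one-dimensional computation. The Gaussian choice of $P$, the budgets $\varepsilon = 0.05$ and $\rho = 0.1$, and the outlier placement are illustrative assumptions; no robust estimation is performed here.

import numpy as np
from scipy.stats import wasserstein_distance  # 1-D W1 between empirical measures

rng = np.random.default_rng(0)
n, d = 1000, 5
eps, rho = 0.05, 0.1  # hypothetical global/local corruption budgets

# Clean samples from P (standard Gaussian, an assumption for illustration)
clean = rng.standard_normal((n, d))

# Local corruptions: perturb every sample by a vector of magnitude rho,
# so the average perturbation magnitude is bounded by rho
noise = rng.standard_normal((n, d))
noise *= rho / np.linalg.norm(noise, axis=1, keepdims=True)
corrupted = clean + noise

# Global corruptions: an eps-fraction of samples is replaced arbitrarily
m = int(eps * n)
corrupted[:m] = 100.0  # adversarial outliers placed far from the support of P

# Rank-1 orthogonal projection (k = 1): along a unit direction u, the W1
# distance between the projected measures is a 1-D Wasserstein distance
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
print(wasserstein_distance(corrupted @ u, clean @ u))

Even this naive comparison shows how a small fraction of far-away outliers can dominate the projected $W_1$ error, which is the failure mode the paper's estimator is designed to control.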

Cite this Paper
BibTeX
@InProceedings{pmlr-v247-nietert24a,
  title     = {Robust Distribution Learning with Local and Global Adversarial Corruptions (extended abstract)},
  author    = {Nietert, Sloan and Goldfeld, Ziv and Shafiee, Soroosh},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {4007--4008},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/nietert24a/nietert24a.pdf},
  url       = {https://proceedings.mlr.press/v247/nietert24a.html}
}
Endnote
%0 Conference Paper
%T Robust Distribution Learning with Local and Global Adversarial Corruptions (extended abstract)
%A Sloan Nietert
%A Ziv Goldfeld
%A Soroosh Shafiee
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-nietert24a
%I PMLR
%P 4007--4008
%U https://proceedings.mlr.press/v247/nietert24a.html
%V 247
APA
Nietert, S., Goldfeld, Z. & Shafiee, S. (2024). Robust Distribution Learning with Local and Global Adversarial Corruptions (extended abstract). Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:4007-4008. Available from https://proceedings.mlr.press/v247/nietert24a.html.