Statistically robust neural network classification
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:1735-1745, 2021.
Abstract
Despite the numerous successes of adversarial risk metrics, there are many scenarios where they do not provide an appropriate measure of robustness. For example, test-time perturbations may occur in a probabilistic manner rather than being generated by an explicit adversary, while the poor train–test generalization of adversarial metrics can limit their use to simple problems. Motivated by this, we develop a probabilistic robust risk framework, the statistically robust risk (SRR), which considers pointwise corruption distributions rather than worst-case adversaries. The SRR provides a distinct and complementary measure of robust performance to the natural and adversarial risks. We show that the SRR admits estimation and training schemes that are as simple and efficient as those for the natural risk: they simply require noising the inputs, but with a principled derivation of exactly how and why this should be done. Furthermore, we demonstrate both theoretically and experimentally that the SRR can provide superior generalization performance compared with adversarial risks, enabling application to high-dimensional datasets.
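To make the "noising the inputs" scheme concrete, the sketch below (not taken from the paper) shows a Monte Carlo estimate of an SRR-style objective: the loss is averaged over corrupted copies of each input drawn from a pointwise corruption distribution. The isotropic Gaussian corruption model and the values of `corruption_std` and `n_samples` are illustrative assumptions, not choices prescribed by the authors.

```python
import torch
import torch.nn.functional as F

def srr_loss(model, x, y, corruption_std=0.1, n_samples=8):
    # Monte Carlo estimate of a statistically-robust-risk-style objective:
    #   E_{x' ~ p(.|x)} [ loss(model(x'), y) ],
    # here with an assumed isotropic Gaussian corruption distribution
    # p(.|x) = N(x, sigma^2 I). The Gaussian choice, corruption_std,
    # and n_samples are illustrative, not values from the paper.
    losses = [
        F.cross_entropy(model(x + corruption_std * torch.randn_like(x)), y)
        for _ in range(n_samples)
    ]
    return torch.stack(losses).mean()

# Minimal usage: training on this objective reduces to ordinary
# training with noised inputs.
model = torch.nn.Sequential(torch.nn.Linear(784, 10))
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
srr_loss(model, x, y).backward()
```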