Optimal Zero-Shot Detector for Multi-Armed Attacks
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:2467-2475, 2024.
Abstract
We study a setting in which a malicious actor can manipulate data samples using a multi-armed attack strategy, that is, the attacker has multiple ways to introduce noise into a data sample. Our objective is to protect the data by detecting any alteration of the input. We take a deliberately conservative view of the defense, in which the defender has significantly less information than the attacker. Specifically, the defender cannot use any data samples to train a defense model or to verify the integrity of the channel. Instead, the defender relies exclusively on a set of pre-existing detectors available "off the shelf." To address this challenge, we derive an information-theoretic defense that optimally aggregates the decisions made by these detectors without requiring any training data. For empirical evaluation, we consider a practical use case in which the attacker possesses a pre-trained classifier and launches well-known adversarial attacks against it. Our experiments show that the proposed solution is effective, even in scenarios that deviate from the optimal setup.