DENL: Diverse Ensemble and Noisy Logits for Improved Robustness of Neural Networks
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:1574-1589, 2024.
Abstract
Neural Networks (NN) are increasingly used for image classification in medical, transportation, and security devices. However, recent studies have revealed neural networks’ vulnerability to adversarial examples generated by adding small perturbations to images. These malicious samples are imperceptible to the human eye but can cause misclassification by NN models. Defensive distillation is a defence mechanism in which the NN’s output probabilities are scaled to a user-defined range and used as labels to train a new model that is less sensitive to input perturbations. Despite initial success, defensive distillation was defeated by state-of-the-art attacks. A proposed countermeasure was to add noise at inference time to hamper adversarial attacks, which, however, also decreased model accuracy. In this paper, we address this limitation by proposing a two-phase training methodology to defend against adversarial attacks. In the first phase, we train architecturally diversified models individually using the cross-entropy loss function. In the second phase, we train the ensemble using a diversity-promoting loss function. Our experimental results show that our training methodology, combined with noise addition at inference time, improved our ensemble’s resistance to adversarial attacks while maintaining reasonable accuracy compared to state-of-the-art methods.
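The two-phase recipe and the noisy-logits inference step described above can be outlined roughly as follows. This is only an illustrative sketch, not the paper's implementation: the specific diversity-promoting loss, the noise distribution, and the hyper-parameters (`lam`, `sigma`) are assumptions made here for exposition.

```python
# Hypothetical sketch of the two-phase ensemble training and noisy-logits
# inference. The diversity penalty and noise scale below are illustrative
# placeholders, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def phase1_train(models, loader, epochs=10, lr=1e-3):
    """Phase 1: train each architecturally different member on its own
    with the standard cross-entropy loss."""
    for model in models:
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()

def phase2_train(models, loader, epochs=5, lr=1e-4, lam=0.5):
    """Phase 2: fine-tune the ensemble jointly with cross-entropy plus a
    diversity-promoting term (here, a placeholder that penalises pairwise
    agreement between the members' softmax outputs)."""
    params = [p for m in models for p in m.parameters()]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            logits = [m(x) for m in models]
            ce = sum(F.cross_entropy(z, y) for z in logits) / len(models)
            probs = [F.softmax(z, dim=1) for z in logits]
            div = 0.0  # placeholder diversity penalty (assumed form)
            for i in range(len(probs)):
                for j in range(i + 1, len(probs)):
                    div = div + (probs[i] * probs[j]).sum(dim=1).mean()
            (ce + lam * div).backward()
            opt.step()

def noisy_ensemble_predict(models, x, sigma=0.1):
    """Inference: add Gaussian noise to each member's logits before
    averaging, standing in for the paper's noisy-logits step."""
    with torch.no_grad():
        logits = [m(x) for m in models]
        noisy = [z + sigma * torch.randn_like(z) for z in logits]
        return torch.stack(noisy).mean(dim=0).argmax(dim=1)
```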