Better Diffusion Models Further Improve Adversarial Training
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:36246-36263, 2023.
Abstract
It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training. After two years of rapid development in diffusion models, a question naturally arises: can better diffusion models further improve adversarial training? This paper gives an affirmative answer by employing the most recent diffusion model, which offers higher sampling efficiency (∼20 sampling steps) and better image quality (lower FID score) than DDPM. Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data (no external datasets). Under the ℓ∞-norm threat model with ϵ=8/255, our models achieve 70.69% and 42.67% robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e., improving upon previous state-of-the-art models by +4.58% and +8.03%. Under the ℓ2-norm threat model with ϵ=128/255, our models achieve 84.86% on CIFAR-10 (+4.44%). These results also beat previous works that use external data. We also provide compelling results on the SVHN and Tiny ImageNet datasets. Our code is available at https://github.com/wzekai99/DM-Improves-AT.
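To make the evaluated setting concrete, below is a minimal sketch (not the authors' implementation) of PGD-based adversarial training under the ℓ∞-norm threat model with ϵ=8/255, the configuration reported in the abstract. The step size (2/255), number of attack steps (10), the toy linear model, and the random stand-in batches are hypothetical placeholders; in the paper, training batches would mix original CIFAR images with diffusion-generated samples and use a much larger network.

```python
# Hedged sketch of PGD adversarial training under the l_inf threat model.
# Assumptions (not from the paper): step size alpha=2/255, 10 attack steps,
# a toy linear model, and random tensors standing in for real + generated data.
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft l_inf-bounded adversarial examples via projected gradient descent."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # gradient ascent step
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                  # keep pixels in [0, 1]
    return x_adv.detach()

# Hypothetical toy model and optimizer standing in for a WideResNet on CIFAR-10.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

for step in range(5):                        # abbreviated training loop
    x = torch.rand(8, 3, 32, 32)             # would be real + diffusion-generated images
    y = torch.randint(0, 10, (8,))
    x_adv = pgd_attack(model, x, y)          # inner maximization
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()  # outer minimization on adversarial inputs
    optimizer.step()
```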