Better Diffusion Models Further Improve Adversarial Training
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:36246-36263, 2023.
Abstract
It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training. After two years of rapid development in diffusion models, a question naturally arises: can better diffusion models further improve adversarial training? This paper gives an affirmative answer by employing the most recent diffusion model, which has higher sampling efficiency ($\sim 20$ sampling steps) and better image quality (lower FID score) than DDPM. Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data (no external datasets). Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve $70.69\%$ and $42.67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e., improving upon the previous state-of-the-art models by $+4.58\%$ and $+8.03\%$. Under the $\ell_2$-norm threat model with $\epsilon=128/255$, our models achieve $84.86\%$ on CIFAR-10 ($+4.44\%$). These results also beat previous works that use external data. We also provide compelling results on the SVHN and TinyImageNet datasets. Our code is at https://github.com/wzekai99/DM-Improves-AT.
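For readers unfamiliar with the $\ell_\infty$ threat model ($\epsilon=8/255$) referenced above, the sketch below shows standard PGD-based adversarial training in PyTorch. It is illustrative only, not the paper's recipe: the step size `alpha` and step count `steps` are assumed values, and the actual training additionally mixes diffusion-generated images into each batch (see the repository linked above for the authors' implementation).

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft L-inf PGD adversarial examples.

    eps matches the abstract's threat model; alpha and steps are
    illustrative assumptions, not the paper's configuration.
    """
    delta = torch.empty_like(x).uniform_(-eps, eps)  # random start in the eps-ball
    delta = delta.detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        # Ascend the loss, then project back into the L-inf eps-ball
        delta.data = (delta + alpha * delta.grad.sign()).clamp(-eps, eps)
        # Keep perturbed pixels in the valid [0, 1] range
        delta.data = (x + delta.data).clamp(0, 1) - x
        delta.grad.zero_()
    return (x + delta).detach()

def adv_train_step(model, optimizer, x, y):
    """One adversarial-training step: fit the model on PGD examples.

    In the paper's setting, the batch (x, y) would mix real CIFAR
    images with diffusion-generated ones.
    """
    model.eval()
    x_adv = pgd_linf(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Robust accuracy, as reported on RobustBench, is then simply clean-style accuracy measured on adversarial examples crafted against the trained model (RobustBench uses the stronger AutoAttack suite rather than plain PGD for evaluation).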