Non-vacuous Generalization Bounds for Adversarial Risk in Stochastic Neural Networks

Waleed Mustafa, Philipp Liznerski, Antoine Ledent, Dennis Wagner, Puyu Wang, Marius Kloft
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4528-4536, 2024.

Abstract

Adversarial examples are manipulated samples used to deceive machine learning models, posing a serious threat in safety-critical applications. Existing safety certificates for machine learning models are limited to individual input examples, failing to capture generalization to unseen data. To address this limitation, we propose novel generalization bounds based on the PAC-Bayesian and randomized smoothing frameworks, providing certificates that predict the model’s performance and robustness on unseen test samples based solely on the training data. We present an effective procedure to train and compute the first non-vacuous generalization bounds for neural networks in adversarial settings. Experimental results on the widely recognized MNIST and CIFAR-10 datasets demonstrate the efficacy of our approach, yielding the first robust risk certificates for stochastic convolutional neural networks under the $L_2$ threat model. Our method offers valuable tools for evaluating model susceptibility to real-world adversarial risks.
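The certificates above build on the randomized smoothing framework. As a point of reference, the sketch below shows the standard smoothed-classifier certification under Gaussian noise (the Monte Carlo prediction and the $L_2$ radius $r = \sigma\,\Phi^{-1}(p)$ from Cohen et al., 2019), not the paper's exact training or bound-computation procedure; the toy base classifier is purely illustrative.

```python
import numpy as np
from scipy.stats import norm

def smoothed_predict(base_classifier, x, sigma, n_samples=1000, rng=None):
    """Monte Carlo estimate of the smoothed classifier's top class and the
    fraction of noisy samples assigned to it, under N(0, sigma^2 I) noise."""
    rng = np.random.default_rng(rng)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    preds = np.array([base_classifier(x + eps) for eps in noise])
    classes, counts = np.unique(preds, return_counts=True)
    top = np.argmax(counts)
    return int(classes[top]), counts[top] / n_samples

def certified_l2_radius(sigma, p_top):
    """Standard L2 robustness radius of a smoothed classifier:
    r = sigma * Phi^{-1}(p_top); no certificate when p_top <= 1/2."""
    return sigma * norm.ppf(p_top) if p_top > 0.5 else 0.0

# Toy base classifier (illustrative only): sign of the first coordinate.
clf = lambda z: int(z[0] > 0)
x = np.array([0.5, 0.0])
label, p = smoothed_predict(clf, x, sigma=0.5, n_samples=2000, rng=0)
radius = certified_l2_radius(0.5, p)  # robust to any L2 perturbation < radius
```

In practice the top-class probability is replaced by a high-confidence lower bound (e.g. a Clopper-Pearson interval over the Monte Carlo samples) before the radius is computed, so the certificate holds with high probability rather than in expectation.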

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-mustafa24a,
  title     = {Non-vacuous Generalization Bounds for Adversarial Risk in Stochastic Neural Networks},
  author    = {Mustafa, Waleed and Liznerski, Philipp and Ledent, Antoine and Wagner, Dennis and Wang, Puyu and Kloft, Marius},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4528--4536},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/mustafa24a/mustafa24a.pdf},
  url       = {https://proceedings.mlr.press/v238/mustafa24a.html},
  abstract  = {Adversarial examples are manipulated samples used to deceive machine learning models, posing a serious threat in safety-critical applications. Existing safety certificates for machine learning models are limited to individual input examples, failing to capture generalization to unseen data. To address this limitation, we propose novel generalization bounds based on the PAC-Bayesian and randomized smoothing frameworks, providing certificates that predict the model’s performance and robustness on unseen test samples based solely on the training data. We present an effective procedure to train and compute the first non-vacuous generalization bounds for neural networks in adversarial settings. Experimental results on the widely recognized MNIST and CIFAR-10 datasets demonstrate the efficacy of our approach, yielding the first robust risk certificates for stochastic convolutional neural networks under the $L_2$ threat model. Our method offers valuable tools for evaluating model susceptibility to real-world adversarial risks.}
}
Endnote
%0 Conference Paper
%T Non-vacuous Generalization Bounds for Adversarial Risk in Stochastic Neural Networks
%A Waleed Mustafa
%A Philipp Liznerski
%A Antoine Ledent
%A Dennis Wagner
%A Puyu Wang
%A Marius Kloft
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-mustafa24a
%I PMLR
%P 4528--4536
%U https://proceedings.mlr.press/v238/mustafa24a.html
%V 238
%X Adversarial examples are manipulated samples used to deceive machine learning models, posing a serious threat in safety-critical applications. Existing safety certificates for machine learning models are limited to individual input examples, failing to capture generalization to unseen data. To address this limitation, we propose novel generalization bounds based on the PAC-Bayesian and randomized smoothing frameworks, providing certificates that predict the model’s performance and robustness on unseen test samples based solely on the training data. We present an effective procedure to train and compute the first non-vacuous generalization bounds for neural networks in adversarial settings. Experimental results on the widely recognized MNIST and CIFAR-10 datasets demonstrate the efficacy of our approach, yielding the first robust risk certificates for stochastic convolutional neural networks under the $L_2$ threat model. Our method offers valuable tools for evaluating model susceptibility to real-world adversarial risks.
APA
Mustafa, W., Liznerski, P., Ledent, A., Wagner, D., Wang, P. &amp; Kloft, M. (2024). Non-vacuous Generalization Bounds for Adversarial Risk in Stochastic Neural Networks. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4528-4536. Available from https://proceedings.mlr.press/v238/mustafa24a.html.