Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense

Bao Gia Doan, Ehsan M. Abbasnejad, Javen Qinfeng Shi, Damith C. Ranasinghe
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:5309-5323, 2022.

Abstract

We present a new algorithm to learn a deep neural network model robust against adversarial attacks. Previous algorithms demonstrate that an adversarially trained Bayesian Neural Network (BNN) provides improved robustness. We recognize that the learning approach for approximating the multi-modal posterior distribution of an adversarially trained Bayesian model can lead to mode collapse; consequently, the model's achievements in robustness and performance are sub-optimal. To address this, we first propose preventing mode collapse to better approximate the multi-modal posterior distribution. Second, based on the intuition that a robust model should ignore perturbations and only consider the informative content of the input, we conceptualize and formulate an information gain objective to measure and force the information learned from both benign and adversarial training instances to be similar. Importantly, we prove and demonstrate that minimizing the information gain objective allows the adversarial risk to approach the conventional empirical risk. We believe our efforts provide a step towards a basis for a principled method of adversarially training BNNs. Our extensive experimental results demonstrate significantly improved robustness, up to 20%, compared with adversarial training and Adv-BNN under PGD attacks with 0.035 distortion on both the CIFAR-10 and STL-10 datasets.
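
To make the information gain objective concrete, the sketch below is a minimal PyTorch illustration of the idea (our own construction, not the authors' released implementation). It estimates a BALD-style information gain I(y; w | x), the mutual information between the prediction and the model weights, from Monte Carlo samples of an approximate weight posterior, and then penalizes the gap between the information gained from benign and adversarial inputs. Using MC dropout as the posterior sampler and the names predictive_information_gain and information_gain_penalty are illustrative assumptions.

import torch
import torch.nn.functional as F

def predictive_information_gain(model, x, n_samples=10):
    # BALD-style estimate of I(y; w | x): entropy of the mean
    # predictive distribution minus the mean of per-sample entropies.
    # MC dropout stands in for posterior samples (an assumption).
    model.train()  # keep dropout active so each pass is a weight sample
    probs = torch.stack(
        [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
    )  # shape: (n_samples, batch, classes)
    mean_probs = probs.mean(dim=0)
    entropy_of_mean = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)
    mean_entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    return entropy_of_mean - mean_entropy  # per-example information gain

def information_gain_penalty(model, x_benign, x_adv, n_samples=10):
    # Force the information learned from benign and adversarial
    # instances to be similar by penalizing the gap between them.
    ig_benign = predictive_information_gain(model, x_benign, n_samples)
    ig_adv = predictive_information_gain(model, x_adv, n_samples)
    return (ig_benign - ig_adv).abs().mean()

In training, such a penalty would be added to the adversarial cross-entropy loss with a weighting coefficient; the paper's result suggests that driving this term down pulls the adversarial risk toward the conventional empirical risk.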

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-doan22a,
  title     = {{B}ayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense},
  author    = {Doan, Bao Gia and Abbasnejad, Ehsan M. and Shi, Javen Qinfeng and Ranasinghe, Damith C.},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {5309--5323},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/doan22a/doan22a.pdf},
  url       = {https://proceedings.mlr.press/v162/doan22a.html},
  abstract  = {We present a new algorithm to learn a deep neural network model robust against adversarial attacks. Previous algorithms demonstrate that an adversarially trained Bayesian Neural Network (BNN) provides improved robustness. We recognize that the learning approach for approximating the multi-modal posterior distribution of an adversarially trained Bayesian model can lead to mode collapse; consequently, the model's achievements in robustness and performance are sub-optimal. To address this, we first propose preventing mode collapse to better approximate the multi-modal posterior distribution. Second, based on the intuition that a robust model should ignore perturbations and only consider the informative content of the input, we conceptualize and formulate an information gain objective to measure and force the information learned from both benign and adversarial training instances to be similar. Importantly, we prove and demonstrate that minimizing the information gain objective allows the adversarial risk to approach the conventional empirical risk. We believe our efforts provide a step towards a basis for a principled method of adversarially training BNNs. Our extensive experimental results demonstrate significantly improved robustness, up to 20%, compared with adversarial training and Adv-BNN under PGD attacks with 0.035 distortion on both the CIFAR-10 and STL-10 datasets.}
}
Endnote
%0 Conference Paper
%T Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense
%A Bao Gia Doan
%A Ehsan M. Abbasnejad
%A Javen Qinfeng Shi
%A Damith C. Ranasinghe
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-doan22a
%I PMLR
%P 5309--5323
%U https://proceedings.mlr.press/v162/doan22a.html
%V 162
%X We present a new algorithm to learn a deep neural network model robust against adversarial attacks. Previous algorithms demonstrate that an adversarially trained Bayesian Neural Network (BNN) provides improved robustness. We recognize that the learning approach for approximating the multi-modal posterior distribution of an adversarially trained Bayesian model can lead to mode collapse; consequently, the model's achievements in robustness and performance are sub-optimal. To address this, we first propose preventing mode collapse to better approximate the multi-modal posterior distribution. Second, based on the intuition that a robust model should ignore perturbations and only consider the informative content of the input, we conceptualize and formulate an information gain objective to measure and force the information learned from both benign and adversarial training instances to be similar. Importantly, we prove and demonstrate that minimizing the information gain objective allows the adversarial risk to approach the conventional empirical risk. We believe our efforts provide a step towards a basis for a principled method of adversarially training BNNs. Our extensive experimental results demonstrate significantly improved robustness, up to 20%, compared with adversarial training and Adv-BNN under PGD attacks with 0.035 distortion on both the CIFAR-10 and STL-10 datasets.
APA
Doan, B.G., Abbasnejad, E.M., Shi, J.Q. & Ranasinghe, D.C. (2022). Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:5309-5323. Available from https://proceedings.mlr.press/v162/doan22a.html.
