SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation

Wuxinlin Cheng, Chenhui Deng, Zhiqiang Zhao, Yaohui Cai, Zhiru Zhang, Zhuo Feng
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1814-1824, 2021.

Abstract

A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model. Our approach, named SPADE, exploits bijective distance mapping between the input/output graphs constructed for approximating the manifolds corresponding to the input/output data. By leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proved to be an upper bound of the best Lipschitz constant under the manifold setting. To reveal the most non-robust data samples highly vulnerable to adversarial attacks, we develop a spectral graph embedding procedure leveraging dominant generalized eigenvectors. This embedding step allows assigning each data point a robustness score that can be further harnessed for more effective adversarial training of ML models. Our experiments show promising empirical results for neural networks trained with the MNIST and CIFAR-10 data sets.
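The abstract names the ingredients of the SPADE score (input/output graphs and a generalized eigenvalue problem) but does not state the formula. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the score can be read as the largest generalized eigenvalue of the input and output graph Laplacians, i.e. the largest eigenvalue of pinv(L_Y) L_X, and it uses dense, unweighted k-nearest-neighbor graphs. The function names `knn_laplacian` and `spectral_distortion_score` and the default `k` are hypothetical.

```python
import numpy as np


def knn_laplacian(points, k):
    """Dense Laplacian of a symmetrized, unweighted kNN graph (assumed graph model)."""
    n = len(points)
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)              # exclude self-loops
    nbrs = np.argsort(d2, axis=1)[:, :k]      # indices of the k nearest neighbors
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)              # make the graph undirected
    return np.diag(adj.sum(axis=1)) - adj


def spectral_distortion_score(inputs, outputs, k=5, tol=1e-9):
    """Largest eigenvalue of pinv(L_Y) @ L_X, computed in a symmetric form."""
    L_x = knn_laplacian(inputs, k)
    L_y = knn_laplacian(outputs, k)
    w, V = np.linalg.eigh(L_y)
    keep = w > tol                            # drop the null space of L_Y
    B = V[:, keep] / np.sqrt(w[keep])         # B @ B.T equals pinv(L_Y)
    M = B.T @ L_x @ B                         # symmetric, same nonzero spectrum
    return float(np.linalg.eigvalsh(M)[-1])


# Usage: X holds input samples, Y the corresponding model outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
Y = np.tanh(X @ rng.normal(size=(4, 3)))      # stand-in for a black-box model
score = spectral_distortion_score(X, Y)
```

Only input/output pairs are needed, which matches the black-box setting described above. Because the graphs here are unweighted, the sketch is insensitive to a uniform rescaling of the outputs; a weighted graph construction would be needed to capture that.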

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-cheng21a,
  title = {SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation},
  author = {Cheng, Wuxinlin and Deng, Chenhui and Zhao, Zhiqiang and Cai, Yaohui and Zhang, Zhiru and Feng, Zhuo},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {1814--1824},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/cheng21a/cheng21a.pdf},
  url = {https://proceedings.mlr.press/v139/cheng21a.html},
  abstract = {A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model. Our approach, named SPADE, exploits bijective distance mapping between the input/output graphs constructed for approximating the manifolds corresponding to the input/output data. By leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proved to be an upper bound of the best Lipschitz constant under the manifold setting. To reveal the most non-robust data samples highly vulnerable to adversarial attacks, we develop a spectral graph embedding procedure leveraging dominant generalized eigenvectors. This embedding step allows assigning each data point a robustness score that can be further harnessed for more effective adversarial training of ML models. Our experiments show promising empirical results for neural networks trained with the MNIST and CIFAR-10 data sets.}
}
Endnote
%0 Conference Paper
%T SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation
%A Wuxinlin Cheng
%A Chenhui Deng
%A Zhiqiang Zhao
%A Yaohui Cai
%A Zhiru Zhang
%A Zhuo Feng
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-cheng21a
%I PMLR
%P 1814--1824
%U https://proceedings.mlr.press/v139/cheng21a.html
%V 139
%X A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model. Our approach, named SPADE, exploits bijective distance mapping between the input/output graphs constructed for approximating the manifolds corresponding to the input/output data. By leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proved to be an upper bound of the best Lipschitz constant under the manifold setting. To reveal the most non-robust data samples highly vulnerable to adversarial attacks, we develop a spectral graph embedding procedure leveraging dominant generalized eigenvectors. This embedding step allows assigning each data point a robustness score that can be further harnessed for more effective adversarial training of ML models. Our experiments show promising empirical results for neural networks trained with the MNIST and CIFAR-10 data sets.
APA
Cheng, W., Deng, C., Zhao, Z., Cai, Y., Zhang, Z. & Feng, Z. (2021). SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1814-1824. Available from https://proceedings.mlr.press/v139/cheng21a.html.
