Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization

Sicheng Zhu; Xiao Zhang; David Evans

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:11609-11618, 2020.

Abstract

Training machine learning models that are robust against adversarial inputs poses seemingly insurmountable challenges. To better understand adversarial robustness, we consider the underlying problem of learning robust representations. We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the worst-case input perturbation. Then, we prove a theorem that establishes a lower bound on the minimum adversarial risk that can be achieved for any downstream classifier based on its representation vulnerability. We propose an unsupervised learning method for obtaining intrinsically robust representations by maximizing the worst-case mutual information between the input and output distributions. Experiments on downstream classification tasks support the robustness of the representations found using unsupervised learning with our training principle.
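
The two quantities named in the abstract can be written down roughly as follows. This is only a sketch: the symbols $g$ (the representation map), $\mu_X$ (the input distribution), $\mathcal{B}(\mu_X, \epsilon)$ (the set of perturbed input distributions within budget $\epsilon$), and $\mathrm{RV}_\epsilon$ are notation assumed here for illustration, and the paper's own definitions may differ in detail.

```latex
% Assumed notation: X ~ \mu_X is the clean input, X' ~ \mu_{X'} a perturbed input,
% and I(.;.) denotes mutual information.

% Representation vulnerability: the largest drop in mutual information
% between input and representation under a worst-case input perturbation.
\[
  \mathrm{RV}_\epsilon(g)
  = \sup_{\mu_{X'} \in \mathcal{B}(\mu_X, \epsilon)}
    \bigl[\, I\bigl(X; g(X)\bigr) - I\bigl(X'; g(X')\bigr) \,\bigr]
\]

% Worst-case mutual information maximization: choose g so that the
% representation stays informative even for the least favorable perturbation.
\[
  \max_{g} \; \inf_{\mu_{X'} \in \mathcal{B}(\mu_X, \epsilon)}
    I\bigl(X'; g(X')\bigr)
\]
```

Because $I(X; g(X))$ does not depend on the perturbation, the first expression equals the clean mutual information minus the worst-case mutual information, so a representation with small vulnerability retains nearly as much information about a perturbed input as about a clean one.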

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-zhu20e,
  title = 	 {Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization},
  author =       {Zhu, Sicheng and Zhang, Xiao and Evans, David},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {11609--11618},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/zhu20e/zhu20e.pdf},
  url = 	 {https://proceedings.mlr.press/v119/zhu20e.html},
  abstract = 	 {Training machine learning models that are robust against adversarial inputs poses seemingly insurmountable challenges. To better understand adversarial robustness, we consider the underlying problem of learning robust representations. We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the worst-case input perturbation. Then, we prove a theorem that establishes a lower bound on the minimum adversarial risk that can be achieved for any downstream classifier based on its representation vulnerability. We propose an unsupervised learning method for obtaining intrinsically robust representations by maximizing the worst-case mutual information between the input and output distributions. Experiments on downstream classification tasks support the robustness of the representations found using unsupervised learning with our training principle.}
}

Endnote

%0 Conference Paper
%T Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
%A Sicheng Zhu
%A Xiao Zhang
%A David Evans
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-zhu20e
%I PMLR
%P 11609--11618
%U https://proceedings.mlr.press/v119/zhu20e.html
%V 119
%X Training machine learning models that are robust against adversarial inputs poses seemingly insurmountable challenges. To better understand adversarial robustness, we consider the underlying problem of learning robust representations. We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the worst-case input perturbation. Then, we prove a theorem that establishes a lower bound on the minimum adversarial risk that can be achieved for any downstream classifier based on its representation vulnerability. We propose an unsupervised learning method for obtaining intrinsically robust representations by maximizing the worst-case mutual information between the input and output distributions. Experiments on downstream classification tasks support the robustness of the representations found using unsupervised learning with our training principle.

APA

Zhu, S., Zhang, X. & Evans, D. (2020). Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:11609-11618. Available from https://proceedings.mlr.press/v119/zhu20e.html.
