Does a Neural Network Really Encode Symbolic Concepts?

Mingjie Li; Quanshi Zhang

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:20452-20469, 2023.

Abstract

Recently, a series of studies have tried to extract interactions between input variables modeled by a DNN and define such interactions as concepts encoded by the DNN. However, strictly speaking, there is still no solid guarantee that such interactions indeed represent meaningful concepts. Therefore, in this paper, we examine the trustworthiness of interaction concepts from four perspectives. Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts, which is partially aligned with human intuition. The code is released at https://github.com/sjtu-xai-lab/interaction-concept.

Cite this Paper

BibTeX

@InProceedings{pmlr-v202-li23at,
  title = {Does a Neural Network Really Encode Symbolic Concepts?},
  author = {Li, Mingjie and Zhang, Quanshi},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {20452--20469},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/li23at/li23at.pdf},
  url = {https://proceedings.mlr.press/v202/li23at.html},
  abstract = {Recently, a series of studies have tried to extract interactions between input variables modeled by a DNN and define such interactions as concepts encoded by the DNN. However, strictly speaking, there still lacks a solid guarantee whether such interactions indeed represent meaningful concepts. Therefore, in this paper, we examine the trustworthiness of interaction concepts from four perspectives. Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts, which is partially aligned with human intuition. The code is released at https://github.com/sjtu-xai-lab/interaction-concept.}
}

Endnote

%0 Conference Paper
%T Does a Neural Network Really Encode Symbolic Concepts?
%A Mingjie Li
%A Quanshi Zhang
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-li23at
%I PMLR
%P 20452--20469
%U https://proceedings.mlr.press/v202/li23at.html
%V 202
%X Recently, a series of studies have tried to extract interactions between input variables modeled by a DNN and define such interactions as concepts encoded by the DNN. However, strictly speaking, there still lacks a solid guarantee whether such interactions indeed represent meaningful concepts. Therefore, in this paper, we examine the trustworthiness of interaction concepts from four perspectives. Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts, which is partially aligned with human intuition. The code is released at https://github.com/sjtu-xai-lab/interaction-concept.

APA

Li, M. & Zhang, Q. (2023). Does a Neural Network Really Encode Symbolic Concepts? Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:20452-20469. Available from https://proceedings.mlr.press/v202/li23at.html.
