On the Impact of Knowledge Distillation for Model Interpretability

Hyeongrok Han, Siwon Kim, Hyun-Soo Choi, Sungroh Yoon
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:12389-12410, 2023.

Abstract

Several recent studies have elucidated why knowledge distillation (KD) improves model performance, but few have examined its benefits beyond accuracy. In this study, we show that KD enhances the interpretability of models as well as their accuracy. To compare model interpretability quantitatively, we measured the number of concept detectors identified by network dissection. We attribute the improvement in interpretability to the class-similarity information transferred from the teacher to the student model. First, we confirm that class-similarity information is transferred from the teacher to the student model via logit distillation. We then analyze how class-similarity information affects model interpretability, both in terms of its presence or absence and its degree. We conducted extensive quantitative and qualitative experiments across different datasets, KD methods, and measures of interpretability. Our results suggest that models distilled from large models can be used more reliably in various fields. The code is available at https://github.com/Rok07/KD_XAI.git.
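The paper's own training and analysis code is in the linked repository. As context for the term "logit distillation" used above, the sketch below is a minimal PyTorch implementation of the standard soft-target KD objective (Hinton et al., 2015), not the authors' exact recipe; the temperature T and mixing weight alpha are assumed hyperparameters. The temperature-scaled teacher distribution is what carries the inter-class similarity information discussed in the abstract.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Temperature-scaled distributions: softening the teacher's logits
    # exposes inter-class similarity that one-hot labels cannot express.
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)

    # KL divergence between the soft distributions; the T**2 factor keeps
    # gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)

    # Ordinary cross-entropy against the hard labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss

In plain cross-entropy training only the hard_loss term is present; the soft_loss term is where the teacher's ranking over the non-target classes reaches the student.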

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-han23b,
  title     = {On the Impact of Knowledge Distillation for Model Interpretability},
  author    = {Han, Hyeongrok and Kim, Siwon and Choi, Hyun-Soo and Yoon, Sungroh},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {12389--12410},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/han23b/han23b.pdf},
  url       = {https://proceedings.mlr.press/v202/han23b.html}
}
Endnote
%0 Conference Paper
%T On the Impact of Knowledge Distillation for Model Interpretability
%A Hyeongrok Han
%A Siwon Kim
%A Hyun-Soo Choi
%A Sungroh Yoon
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-han23b
%I PMLR
%P 12389--12410
%U https://proceedings.mlr.press/v202/han23b.html
%V 202
APA
Han, H., Kim, S., Choi, H., & Yoon, S. (2023). On the Impact of Knowledge Distillation for Model Interpretability. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:12389-12410. Available from https://proceedings.mlr.press/v202/han23b.html.
