Explaining Latent Representations of Neural Networks with Archetypal Analysis

Anna Emilie Jennow Wedenborg, Teresa Dorszewski, Lars Kai Hansen, Kristoffer Knutsen Wickstrøm, Morten Mørup
Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), PMLR 307:448-468, 2026.

Abstract

We apply Archetypal Analysis to the latent spaces of trained neural networks, offering interpretable explanations of feature representations of neural networks without relying on user-defined corpora. Through layer-wise analyses of convolutional networks and vision transformers across multiple classification tasks, we demonstrate that archetypes are robust, dataset-independent, and provide intuitive insights into how models encode and transform information from layer to layer. Our approach enables global insights by characterizing the unique structure of the latent representation space of each layer, while also offering localized explanations of individual decisions as convex combinations of extreme points (i.e., archetypes).

Cite this Paper


BibTeX
@InProceedings{pmlr-v307-wedenborg26a,
  title     = {Explaining Latent Representations of Neural Networks with Archetypal Analysis},
  author    = {Wedenborg, Anna Emilie Jennow and Dorszewski, Teresa and Hansen, Lars Kai and Wickstr{\o}m, Kristoffer Knutsen and M{\o}rup, Morten},
  booktitle = {Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)},
  pages     = {448--468},
  year      = {2026},
  editor    = {Kim, Hyeongji and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume    = {307},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--08 Jan},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v307/main/assets/wedenborg26a/wedenborg26a.pdf},
  url       = {https://proceedings.mlr.press/v307/wedenborg26a.html},
  abstract  = {We apply Archetypal Analysis to the latent spaces of trained neural networks, offering interpretable explanations of feature representations of neural networks without relying on user-defined corpora. Through layer-wise analyses of convolutional networks and vision transformers across multiple classification tasks, we demonstrate that archetypes are robust, dataset-independent, and provide intuitive insights into how models encode and transform information from layer to layer. Our approach enables global insights by characterizing the unique structure of the latent representation space of each layer, while also offering localized explanations of individual decisions as convex combinations of extreme points (i.e., archetypes).}
}
Endnote
%0 Conference Paper
%T Explaining Latent Representations of Neural Networks with Archetypal Analysis
%A Anna Emilie Jennow Wedenborg
%A Teresa Dorszewski
%A Lars Kai Hansen
%A Kristoffer Knutsen Wickstrøm
%A Morten Mørup
%B Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2026
%E Hyeongji Kim
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v307-wedenborg26a
%I PMLR
%P 448--468
%U https://proceedings.mlr.press/v307/wedenborg26a.html
%V 307
%X We apply Archetypal Analysis to the latent spaces of trained neural networks, offering interpretable explanations of feature representations of neural networks without relying on user-defined corpora. Through layer-wise analyses of convolutional networks and vision transformers across multiple classification tasks, we demonstrate that archetypes are robust, dataset-independent, and provide intuitive insights into how models encode and transform information from layer to layer. Our approach enables global insights by characterizing the unique structure of the latent representation space of each layer, while also offering localized explanations of individual decisions as convex combinations of extreme points (i.e., archetypes).
APA
Wedenborg, A.E.J., Dorszewski, T., Hansen, L.K., Wickstrøm, K.K. & Mørup, M. (2026). Explaining Latent Representations of Neural Networks with Archetypal Analysis. Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 307:448-468. Available from https://proceedings.mlr.press/v307/wedenborg26a.html.