Identifying Metric Structures of Deep Latent Variable Models

Stas Syrota, Yevgen Zainchkovskyy, Johnny Xi, Benjamin Bloem-Reddy, Søren Hauberg
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:58065-58087, 2025.

Abstract

Deep latent variable models learn condensed representations of data that, hopefully, reflect the inner workings of the studied phenomena. Unfortunately, these latent representations are not statistically identifiable, meaning they cannot be uniquely determined. Domain experts, therefore, need to tread carefully when interpreting these. Current solutions limit the lack of identifiability through additional constraints on the latent variable model, e.g. by requiring labeled training data, or by restricting the expressivity of the model. We change the goal: instead of identifying the latent variables, we identify relationships between them such as meaningful distances, angles, and volumes. We prove this is feasible under very mild model conditions and without additional labeled data. We empirically demonstrate that our theory results in more reliable latent distances, offering a principled path forward in extracting trustworthy conclusions from deep latent variable models.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-syrota25a,
  title     = {Identifying Metric Structures of Deep Latent Variable Models},
  author    = {Syrota, Stas and Zainchkovskyy, Yevgen and Xi, Johnny and Bloem-Reddy, Benjamin and Hauberg, S{\o}ren},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {58065--58087},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/syrota25a/syrota25a.pdf},
  url       = {https://proceedings.mlr.press/v267/syrota25a.html},
  abstract  = {Deep latent variable models learn condensed representations of data that, hopefully, reflect the inner workings of the studied phenomena. Unfortunately, these latent representations are not statistically identifiable, meaning they cannot be uniquely determined. Domain experts, therefore, need to tread carefully when interpreting these. Current solutions limit the lack of identifiability through additional constraints on the latent variable model, e.g. by requiring labeled training data, or by restricting the expressivity of the model. We change the goal: instead of identifying the latent variables, we identify relationships between them such as meaningful distances, angles, and volumes. We prove this is feasible under very mild model conditions and without additional labeled data. We empirically demonstrate that our theory results in more reliable latent distances, offering a principled path forward in extracting trustworthy conclusions from deep latent variable models.}
}
EndNote
%0 Conference Paper
%T Identifying Metric Structures of Deep Latent Variable Models
%A Stas Syrota
%A Yevgen Zainchkovskyy
%A Johnny Xi
%A Benjamin Bloem-Reddy
%A Søren Hauberg
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-syrota25a
%I PMLR
%P 58065--58087
%U https://proceedings.mlr.press/v267/syrota25a.html
%V 267
%X Deep latent variable models learn condensed representations of data that, hopefully, reflect the inner workings of the studied phenomena. Unfortunately, these latent representations are not statistically identifiable, meaning they cannot be uniquely determined. Domain experts, therefore, need to tread carefully when interpreting these. Current solutions limit the lack of identifiability through additional constraints on the latent variable model, e.g. by requiring labeled training data, or by restricting the expressivity of the model. We change the goal: instead of identifying the latent variables, we identify relationships between them such as meaningful distances, angles, and volumes. We prove this is feasible under very mild model conditions and without additional labeled data. We empirically demonstrate that our theory results in more reliable latent distances, offering a principled path forward in extracting trustworthy conclusions from deep latent variable models.
APA
Syrota, S., Zainchkovskyy, Y., Xi, J., Bloem-Reddy, B., & Hauberg, S. (2025). Identifying Metric Structures of Deep Latent Variable Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:58065-58087. Available from https://proceedings.mlr.press/v267/syrota25a.html.
