Demystifying Inductive Biases for (Beta-)VAE Based Architectures

Dominik Zietlow, Michal Rolinek, Georg Martius
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:12945-12954, 2021.

Abstract

The performance of Beta-Variational-Autoencoders and their variants on learning semantically meaningful, disentangled representations is unparalleled. On the other hand, there are theoretical arguments suggesting the impossibility of unsupervised disentanglement. In this work, we shed light on the inductive bias responsible for the success of VAE-based architectures. We show that in classical datasets the structure of variance, induced by the generating factors, is conveniently aligned with the latent directions fostered by the VAE objective. This builds the pivotal bias on which the disentangling abilities of VAEs rely. By small, elaborate perturbations of existing datasets, we hide the convenient correlation structure that is easily exploited by a variety of architectures. To demonstrate this, we construct modified versions of standard datasets in which (i) the generative factors are perfectly preserved; (ii) each image undergoes a mild transformation causing a small change of variance; (iii) the leading VAE-based disentanglement architectures fail to produce disentangled representations whilst the performance of a non-variational method remains unchanged.
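For orientation, the objective the abstract refers to is the standard beta-VAE loss: a reconstruction term plus a beta-weighted KL divergence between the approximate posterior and an isotropic Gaussian prior. The sketch below is a minimal PyTorch rendering of that loss, not the authors' implementation; the network sizes, the MSE reconstruction term, and the value beta = 4.0 are illustrative assumptions. It shows where the beta-weighted KL term enters, which is the pressure the paper argues pushes latent directions to align with the dataset's variance structure.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    """Toy Gaussian-posterior VAE; layer sizes are placeholders."""
    def __init__(self, input_dim=4096, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.mu_head = nn.Linear(256, latent_dim)
        self.logvar_head = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, input_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar

def beta_vae_loss(x, x_rec, mu, logvar, beta=4.0):
    # Reconstruction term plus beta-weighted KL(q(z|x) || N(0, I)).
    # For beta > 1, the KL term pressures the posterior toward the
    # isotropic prior; this regularization is where the variance-
    # alignment bias discussed in the paper comes into play.
    rec = F.mse_loss(x_rec, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl

# Usage on a stand-in batch of flattened 64x64 grayscale images:
model = BetaVAE()
x = torch.rand(32, 4096)
x_rec, mu, logvar = model(x)
loss = beta_vae_loss(x, x_rec, mu, logvar)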

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-zietlow21a,
  title     = {Demystifying Inductive Biases for (Beta-)VAE Based Architectures},
  author    = {Zietlow, Dominik and Rolinek, Michal and Martius, Georg},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {12945--12954},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/zietlow21a/zietlow21a.pdf},
  url       = {https://proceedings.mlr.press/v139/zietlow21a.html}
}
Endnote
%0 Conference Paper
%T Demystifying Inductive Biases for (Beta-)VAE Based Architectures
%A Dominik Zietlow
%A Michal Rolinek
%A Georg Martius
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-zietlow21a
%I PMLR
%P 12945--12954
%U https://proceedings.mlr.press/v139/zietlow21a.html
%V 139
APA
Zietlow, D., Rolinek, M. & Martius, G. (2021). Demystifying Inductive Biases for (Beta-)VAE Based Architectures. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:12945-12954. Available from https://proceedings.mlr.press/v139/zietlow21a.html.
