Disentangling images with Lie group transformations and sparse coding

Ho Yin Chau, Frank Qiu, Yubei Chen, Bruno Olshausen
Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations, PMLR 197:22-47, 2023.

Abstract

Discrete spatial patterns and their continuous transformations are two important regularities in natural signals. Lie groups and representation theory are mathematical tools used in previous works to model continuous image transformations. On the other hand, sparse coding is an essential tool for learning dictionaries of discrete natural signal patterns. This paper combines these ideas in a Bayesian generative model that learns to disentangle spatial patterns and their continuous transformations in a completely unsupervised manner. Images are modeled as a sparse superposition of shape components followed by a transformation parameterized by $n$ continuous variables. The shape components and transformations are not predefined but are instead adapted to learn the data’s symmetries. The constraint is that the transformations form a representation of an $n$-dimensional torus. Training the model on a dataset consisting of controlled geometric transformations of specific MNIST digits shows that it can recover these transformations along with the digits. Training on the full MNIST dataset shows that it can learn the basic digit shapes and the natural transformations such as shearing and stretching contained in this data. This work provides the simplest known Bayesian mathematical model for building unsupervised factorized representations. The source code is publicly available under MIT License.
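To make the generative model described in the abstract concrete, the following is a minimal, illustrative NumPy sketch; it is not the authors' released implementation. It assumes a sparse dictionary Phi with coefficients alpha, and a transformation T(s) = W R(s) W^{-1} that forms a representation of the n-torus, with R(s) block-diagonal in 2x2 rotations at integer frequencies omega. All names, dimensions, frequency ranges, and the sparsity level are assumptions chosen for illustration.

    # Minimal sketch of the generative model in the abstract (not the authors' code).
    # All parameter names and values here are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    D = 28 * 28   # image dimension (e.g., a flattened MNIST image)
    K = 64        # number of shape components in the sparse dictionary
    n = 2         # number of continuous transformation variables

    # Sparse dictionary of shape components and a mostly-zero coefficient vector.
    Phi   = rng.normal(size=(D, K))
    alpha = np.where(rng.random(K) < 0.1, rng.laplace(size=K), 0.0)

    # Transformation as a representation of the n-torus: T(s) = W R(s) W^{-1},
    # where R(s) is block-diagonal with 2x2 rotations at integer frequencies
    # omega, so T is 2*pi-periodic in each of the n transform variables.
    W     = np.linalg.qr(rng.normal(size=(D, D)))[0]   # orthogonal change of basis
    omega = rng.integers(0, 5, size=(D // 2, n))       # one frequency row per block

    def T(s):
        """Torus-representation transform for parameters s of shape (n,)."""
        angles = omega @ s                             # one rotation angle per block
        R = np.zeros((D, D))
        for b, th in enumerate(angles):
            c, v = np.cos(th), np.sin(th)
            R[2 * b:2 * b + 2, 2 * b:2 * b + 2] = [[c, -v], [v, c]]
        return W @ R @ W.T                             # W orthogonal, so W^{-1} = W.T

    # Generate an image: sparse superposition of shapes, then a continuous transform.
    s = rng.uniform(0, 2 * np.pi, size=n)
    image = T(s) @ (Phi @ alpha) + 0.01 * rng.normal(size=D)

In the paper, Phi, W, and omega are not fixed as above but are learned from data, so the recovered transformations reflect the symmetries present in the training images.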

Cite this Paper

BibTeX
@InProceedings{pmlr-v197-chau23a,
  title     = {Disentangling images with Lie group transformations and sparse coding},
  author    = {Chau, Ho Yin and Qiu, Frank and Chen, Yubei and Olshausen, Bruno},
  booktitle = {Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations},
  pages     = {22--47},
  year      = {2023},
  editor    = {Sanborn, Sophia and Shewmake, Christian and Azeglio, Simone and Di Bernardo, Arianna and Miolane, Nina},
  volume    = {197},
  series    = {Proceedings of Machine Learning Research},
  month     = {03 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v197/chau23a/chau23a.pdf},
  url       = {https://proceedings.mlr.press/v197/chau23a.html}
}
Endnote
%0 Conference Paper
%T Disentangling images with Lie group transformations and sparse coding
%A Ho Yin Chau
%A Frank Qiu
%A Yubei Chen
%A Bruno Olshausen
%B Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations
%C Proceedings of Machine Learning Research
%D 2023
%E Sophia Sanborn
%E Christian Shewmake
%E Simone Azeglio
%E Arianna Di Bernardo
%E Nina Miolane
%F pmlr-v197-chau23a
%I PMLR
%P 22--47
%U https://proceedings.mlr.press/v197/chau23a.html
%V 197
APA
Chau, H.Y., Qiu, F., Chen, Y., & Olshausen, B. (2023). Disentangling images with Lie group transformations and sparse coding. Proceedings of the 1st NeurIPS Workshop on Symmetry and Geometry in Neural Representations, in Proceedings of Machine Learning Research 197:22-47. Available from https://proceedings.mlr.press/v197/chau23a.html.