Disentangled Representations for Sequence Data using Information Bottleneck Principle
Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:305-320, 2020.
We propose the factorizing variational autoencoder (FAVAE), a generative model that learns disentangled representations from sequential data without supervision via the information bottleneck principle. Real-world data are often generated by a few explanatory factors of variation, and disentangled representation learning aims to recover these factors from the data. We focus on disentangled representations of sequential data, which are useful in a wide range of applications such as video, speech, and stock market data. Factors in sequential data fall into two categories: dynamic factors, which are time dependent, and static factors, which are time independent. Previous models disentangle static factors from dynamic ones, and dynamic factors with different time dependencies from one another, by explicitly modeling the priors of the latent variables. However, these models cannot disentangle dynamic factors that share the same time dependency, such as “picking up” and “throwing” in robotic tasks. In contrast, FAVAE disentangles multiple dynamic factors via the information bottleneck principle, which does not require modeling priors. Our experiments show that FAVAE can extract disentangled dynamic factors from synthetic, video, and speech datasets.