The Gaussian Process Prior VAE for Interpretable Latent Dynamics from Pixels

Michael Pearce
Proceedings of The 2nd Symposium on Advances in Approximate Bayesian Inference, PMLR 118:1-12, 2020.

Abstract

We consider the problem of unsupervised learning of a low dimensional, interpretable, latent state of a video containing a moving object. The problem of distilling interpretable dynamics from pixels has been extensively considered through the lens of graphical/state space models (Fraccaro et al., 2017; Lin et al., 2018; Pearce et al., 2018; Chiappa and Paquet, 2019) that exploit Markov structure for cheap computation and structured priors for enforcing interpretability on latent representations. We take a step towards extending these approaches by discarding the Markov structure; inspired by Gaussian process dynamical models (Wang et al., 2006), we instead repurpose the recently proposed Gaussian Process Prior Variational Autoencoder (Casale et al., 2018) for learning interpretable latent dynamics. We describe the model and perform experiments on a synthetic dataset and see that the model reliably reconstructs smooth dynamics exhibiting U-turns and loops. We also observe that this model may be trained without any annealing or freeze-thaw of training parameters in contrast to previous works, albeit for slightly dierent use cases, where application specic training tricks are often required.

Cite this Paper


BibTeX
@InProceedings{pmlr-v118-pearce20a, title = { The Gaussian Process Prior VAE for Interpretable Latent Dynamics from Pixels}, author = {Pearce, Michael}, booktitle = {Proceedings of The 2nd Symposium on Advances in Approximate Bayesian Inference}, pages = {1--12}, year = {2020}, editor = {Zhang, Cheng and Ruiz, Francisco and Bui, Thang and Dieng, Adji Bousso and Liang, Dawen}, volume = {118}, series = {Proceedings of Machine Learning Research}, month = {08 Dec}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v118/pearce20a/pearce20a.pdf}, url = {https://proceedings.mlr.press/v118/pearce20a.html}, abstract = { We consider the problem of unsupervised learning of a low dimensional, interpretable, latent state of a video containing a moving object. The problem of distilling interpretable dynamics from pixels has been extensively considered through the lens of graphical/state space models (Fraccaro et al., 2017; Lin et al., 2018; Pearce et al., 2018; Chiappa and Paquet, 2019) that exploit Markov structure for cheap computation and structured priors for enforcing interpretability on latent representations. We take a step towards extending these approaches by discarding the Markov structure; inspired by Gaussian process dynamical models (Wang et al., 2006), we instead repurpose the recently proposed Gaussian Process Prior Variational Autoencoder (Casale et al., 2018) for learning interpretable latent dynamics. We describe the model and perform experiments on a synthetic dataset and see that the model reliably reconstructs smooth dynamics exhibiting U-turns and loops. We also observe that this model may be trained without any annealing or freeze-thaw of training parameters in contrast to previous works, albeit for slightly dierent use cases, where application specic training tricks are often required.} }
Endnote
%0 Conference Paper %T The Gaussian Process Prior VAE for Interpretable Latent Dynamics from Pixels %A Michael Pearce %B Proceedings of The 2nd Symposium on Advances in Approximate Bayesian Inference %C Proceedings of Machine Learning Research %D 2020 %E Cheng Zhang %E Francisco Ruiz %E Thang Bui %E Adji Bousso Dieng %E Dawen Liang %F pmlr-v118-pearce20a %I PMLR %P 1--12 %U https://proceedings.mlr.press/v118/pearce20a.html %V 118 %X We consider the problem of unsupervised learning of a low dimensional, interpretable, latent state of a video containing a moving object. The problem of distilling interpretable dynamics from pixels has been extensively considered through the lens of graphical/state space models (Fraccaro et al., 2017; Lin et al., 2018; Pearce et al., 2018; Chiappa and Paquet, 2019) that exploit Markov structure for cheap computation and structured priors for enforcing interpretability on latent representations. We take a step towards extending these approaches by discarding the Markov structure; inspired by Gaussian process dynamical models (Wang et al., 2006), we instead repurpose the recently proposed Gaussian Process Prior Variational Autoencoder (Casale et al., 2018) for learning interpretable latent dynamics. We describe the model and perform experiments on a synthetic dataset and see that the model reliably reconstructs smooth dynamics exhibiting U-turns and loops. We also observe that this model may be trained without any annealing or freeze-thaw of training parameters in contrast to previous works, albeit for slightly dierent use cases, where application specic training tricks are often required.
APA
Pearce, M.. (2020). The Gaussian Process Prior VAE for Interpretable Latent Dynamics from Pixels. Proceedings of The 2nd Symposium on Advances in Approximate Bayesian Inference, in Proceedings of Machine Learning Research 118:1-12 Available from https://proceedings.mlr.press/v118/pearce20a.html.

Related Material