Faster Attend-Infer-Repeat with Tractable Probabilistic Models

Karl Stelzner, Robert Peharz, Kristian Kersting
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5966-5975, 2019.

Abstract

The recent Attend-Infer-Repeat (AIR) framework marks a milestone in structured probabilistic modeling, as it tackles the challenging problem of unsupervised scene understanding via Bayesian inference. AIR expresses the composition of visual scenes from individual objects, and uses variational autoencoders to model the appearance of those objects. However, inference in the overall model is highly intractable, which hampers its learning speed and makes it prone to suboptimal solutions. In this paper, we show that the speed and robustness of learning in AIR can be considerably improved by replacing the intractable object representations with tractable probabilistic models. In particular, we opt for sum-product networks (SPNs), expressive deep probabilistic models with a rich set of tractable inference routines. The resulting model, called SuPAIR, learns an order of magnitude faster than AIR, treats object occlusions in a consistent manner, and allows for the inclusion of a background noise model, improving the robustness of Bayesian scene understanding.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-stelzner19a,
  title     = {Faster Attend-Infer-Repeat with Tractable Probabilistic Models},
  author    = {Stelzner, Karl and Peharz, Robert and Kersting, Kristian},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {5966--5975},
  year      = {2019},
  editor    = {Kamalika Chaudhuri and Ruslan Salakhutdinov},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/stelzner19a/stelzner19a.pdf},
  url       = {http://proceedings.mlr.press/v97/stelzner19a.html},
  abstract  = {The recent Attend-Infer-Repeat (AIR) framework marks a milestone in structured probabilistic modeling, as it tackles the challenging problem of unsupervised scene understanding via Bayesian inference. AIR expresses the composition of visual scenes from individual objects, and uses variational autoencoders to model the appearance of those objects. However, inference in the overall model is highly intractable, which hampers its learning speed and makes it prone to suboptimal solutions. In this paper, we show that the speed and robustness of learning in AIR can be considerably improved by replacing the intractable object representations with tractable probabilistic models. In particular, we opt for sum-product networks (SPNs), expressive deep probabilistic models with a rich set of tractable inference routines. The resulting model, called SuPAIR, learns an order of magnitude faster than AIR, treats object occlusions in a consistent manner, and allows for the inclusion of a background noise model, improving the robustness of Bayesian scene understanding.}
}
Endnote
%0 Conference Paper
%T Faster Attend-Infer-Repeat with Tractable Probabilistic Models
%A Karl Stelzner
%A Robert Peharz
%A Kristian Kersting
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-stelzner19a
%I PMLR
%P 5966--5975
%U http://proceedings.mlr.press/v97/stelzner19a.html
%V 97
%X The recent Attend-Infer-Repeat (AIR) framework marks a milestone in structured probabilistic modeling, as it tackles the challenging problem of unsupervised scene understanding via Bayesian inference. AIR expresses the composition of visual scenes from individual objects, and uses variational autoencoders to model the appearance of those objects. However, inference in the overall model is highly intractable, which hampers its learning speed and makes it prone to suboptimal solutions. In this paper, we show that the speed and robustness of learning in AIR can be considerably improved by replacing the intractable object representations with tractable probabilistic models. In particular, we opt for sum-product networks (SPNs), expressive deep probabilistic models with a rich set of tractable inference routines. The resulting model, called SuPAIR, learns an order of magnitude faster than AIR, treats object occlusions in a consistent manner, and allows for the inclusion of a background noise model, improving the robustness of Bayesian scene understanding.
APA
Stelzner, K., Peharz, R., & Kersting, K. (2019). Faster Attend-Infer-Repeat with Tractable Probabilistic Models. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:5966-5975. Available from http://proceedings.mlr.press/v97/stelzner19a.html
