Identifiable Object Representations under Spatial Ambiguities

Avinash Kori, Francesca Toni, Ben Glocker
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:31486-31518, 2025.

Abstract

Modular object-centric representations are essential for human-like reasoning but are challenging to obtain under spatial ambiguities, e.g. due to occlusions and view ambiguities. However, addressing these challenges presents both theoretical and practical difficulties. We introduce a novel multi-view probabilistic approach that aggregates view-specific slots to capture invariant content information while simultaneously learning disentangled global viewpoint-level information. Unlike prior single-view methods, our approach resolves spatial ambiguities, provides theoretical guarantees for identifiability, and requires no viewpoint annotations. Extensive experiments on standard benchmarks and novel complex datasets validate our method’s robustness and scalability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-kori25a,
  title     = {Identifiable Object Representations under Spatial Ambiguities},
  author    = {Kori, Avinash and Toni, Francesca and Glocker, Ben},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {31486--31518},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/kori25a/kori25a.pdf},
  url       = {https://proceedings.mlr.press/v267/kori25a.html},
  abstract  = {Modular object-centric representations are essential for human-like reasoning but are challenging to obtain under spatial ambiguities, e.g. due to occlusions and view ambiguities. However, addressing challenges presents both theoretical and practical difficulties. We introduce a novel multi-view probabilistic approach that aggregates view-specific slots to capture invariant content information while simultaneously learning disentangled global viewpoint-level information. Unlike prior single-view methods, our approach resolves spatial ambiguities, provides theoretical guarantees for identifiability, and requires no viewpoint annotations. Extensive experiments on standard benchmarks and novel complex datasets validate our method’s robustness and scalability.}
}
Endnote
%0 Conference Paper
%T Identifiable Object Representations under Spatial Ambiguities
%A Avinash Kori
%A Francesca Toni
%A Ben Glocker
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-kori25a
%I PMLR
%P 31486--31518
%U https://proceedings.mlr.press/v267/kori25a.html
%V 267
%X Modular object-centric representations are essential for human-like reasoning but are challenging to obtain under spatial ambiguities, e.g. due to occlusions and view ambiguities. However, addressing challenges presents both theoretical and practical difficulties. We introduce a novel multi-view probabilistic approach that aggregates view-specific slots to capture invariant content information while simultaneously learning disentangled global viewpoint-level information. Unlike prior single-view methods, our approach resolves spatial ambiguities, provides theoretical guarantees for identifiability, and requires no viewpoint annotations. Extensive experiments on standard benchmarks and novel complex datasets validate our method’s robustness and scalability.
APA
Kori, A., Toni, F. & Glocker, B. (2025). Identifiable Object Representations under Spatial Ambiguities. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:31486-31518. Available from https://proceedings.mlr.press/v267/kori25a.html.

Related Material