Test-time Adaptation with Slot-Centric Models

Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd Van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:28151-28166, 2023.

Abstract

Current visual detectors, though impressive within their training distribution, often fail to parse out-of-distribution scenes into their constituent entities. Recent test-time adaptation methods use auxiliary self-supervised losses to adapt the network parameters to each test example independently and have shown promising results towards generalization outside the training distribution for the task of image classification. In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases. Recent slot-centric generative models attempt to decompose scenes into entities in a self-supervised manner by reconstructing pixels. Drawing upon these two lines of work, we propose Slot-TTA, a semi-supervised slot-centric scene decomposition model that at test time is adapted per scene through gradient descent on reconstruction or cross-view synthesis objectives. We evaluate Slot-TTA across multiple input modalities, images or 3D point clouds, and show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods. Project Webpage: http://slot-tta.github.io/
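
The per-example adaptation described in the abstract can be illustrated with a minimal sketch. This is not the Slot-TTA architecture: a tied-weight linear autoencoder stands in for the slot-centric model, and the names `reconstruction_loss` and `adapt_to_example`, the step count, and the learning rate are all assumptions chosen for illustration. The sketch only shows the general pattern of copying the trained parameters, taking a few gradient steps on a reconstruction loss for one test example, and discarding the adapted copy afterwards.

```python
import numpy as np


def reconstruction_loss(W, x):
    """Squared error between x and its reconstruction W^T (W x)."""
    r = W.T @ (W @ x) - x
    return float(r @ r)


def adapt_to_example(W_trained, x, steps=50, lr=0.01):
    """Adapt a copy of the trained weights to one test example.

    Hypothetical stand-in for test-time adaptation: plain gradient
    descent on the reconstruction loss of a tied-weight linear
    autoencoder. The trained weights themselves are never modified,
    so every test example starts from the same parameters.
    """
    W = W_trained.copy()
    for _ in range(steps):
        z = W @ x                # encode
        r = W.T @ z - x          # reconstruction residual
        # Analytic gradient of ||W^T W x - x||^2 with respect to W.
        grad = 2.0 * (np.outer(z, r) + np.outer(W @ r, x))
        W -= lr * grad
    return W


# Toy "trained" weights and a single test example (illustrative values).
W_trained = 0.5 * np.eye(2, 4)
x = np.array([1.0, 1.0, 0.0, 0.0])

W_adapted = adapt_to_example(W_trained, x)
print(reconstruction_loss(W_trained, x), reconstruction_loss(W_adapted, x))
```

In this toy setting the adapted copy reconstructs the test example better than the trained weights do, while the trained weights stay untouched for the next example. In the paper's setting the quantity of interest is the scene decomposition, which this sketch does not model.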

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-prabhudesai23a,
  title = {Test-time Adaptation with Slot-Centric Models},
  author = {Prabhudesai, Mihir and Goyal, Anirudh and Paul, Sujoy and Steenkiste, Sjoerd Van and Sajjadi, Mehdi S. M. and Aggarwal, Gaurav and Kipf, Thomas and Pathak, Deepak and Fragkiadaki, Katerina},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {28151--28166},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/prabhudesai23a/prabhudesai23a.pdf},
  url = {https://proceedings.mlr.press/v202/prabhudesai23a.html},
  abstract = {Current visual detectors, though impressive within their training distribution, often fail to parse out-of-distribution scenes into their constituent entities. Recent test-time adaptation methods use auxiliary self-supervised losses to adapt the network parameters to each test example independently and have shown promising results towards generalization outside the training distribution for the task of image classification. In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases. Recent slot-centric generative models attempt to decompose scenes into entities in a self-supervised manner by reconstructing pixels. Drawing upon these two lines of work, we propose Slot-TTA, a semi-supervised slot-centric scene decomposition model that at test time is adapted per scene through gradient descent on reconstruction or cross-view synthesis objectives. We evaluate Slot-TTA across multiple input modalities, images or 3D point clouds, and show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods. Project Webpage: http://slot-tta.github.io/}
}
Endnote
%0 Conference Paper
%T Test-time Adaptation with Slot-Centric Models
%A Mihir Prabhudesai
%A Anirudh Goyal
%A Sujoy Paul
%A Sjoerd Van Steenkiste
%A Mehdi S. M. Sajjadi
%A Gaurav Aggarwal
%A Thomas Kipf
%A Deepak Pathak
%A Katerina Fragkiadaki
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-prabhudesai23a
%I PMLR
%P 28151--28166
%U https://proceedings.mlr.press/v202/prabhudesai23a.html
%V 202
%X Current visual detectors, though impressive within their training distribution, often fail to parse out-of-distribution scenes into their constituent entities. Recent test-time adaptation methods use auxiliary self-supervised losses to adapt the network parameters to each test example independently and have shown promising results towards generalization outside the training distribution for the task of image classification. In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases. Recent slot-centric generative models attempt to decompose scenes into entities in a self-supervised manner by reconstructing pixels. Drawing upon these two lines of work, we propose Slot-TTA, a semi-supervised slot-centric scene decomposition model that at test time is adapted per scene through gradient descent on reconstruction or cross-view synthesis objectives. We evaluate Slot-TTA across multiple input modalities, images or 3D point clouds, and show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods. Project Webpage: http://slot-tta.github.io/
APA
Prabhudesai, M., Goyal, A., Paul, S., Steenkiste, S.V., Sajjadi, M.S.M., Aggarwal, G., Kipf, T., Pathak, D. & Fragkiadaki, K. (2023). Test-time Adaptation with Slot-Centric Models. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:28151-28166. Available from https://proceedings.mlr.press/v202/prabhudesai23a.html.
