Spatial Reasoning with Denoising Models

Christopher Wewer, Bartlomiej Pogodzinski, Bernt Schiele, Jan Eric Lenssen
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:66706-66725, 2025.

Abstract

We introduce Spatial Reasoning Models (SRMs), a framework to perform reasoning over sets of continuous variables via denoising generative models. SRMs infer continuous representations on a set of unobserved variables, given observations on observed variables. Current generative models on spatial domains, such as diffusion and flow matching models, often collapse to hallucination when the target distribution is complex. To measure this, we introduce a set of benchmark tasks that test the quality of complex reasoning in generative models and can quantify hallucination. The SRM framework allows us to report key findings about the importance of sequentialization in generation, the associated order, and the sampling strategies during training. It demonstrates, for the first time, that the order of generation can successfully be predicted by the denoising network itself. Using these findings, we can increase the accuracy of specific reasoning tasks from <1% to >50%. Our project website provides additional videos, code, and the benchmark datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wewer25a,
  title = {Spatial Reasoning with Denoising Models},
  author = {Wewer, Christopher and Pogodzinski, Bartlomiej and Schiele, Bernt and Lenssen, Jan Eric},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {66706--66725},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wewer25a/wewer25a.pdf},
  url = {https://proceedings.mlr.press/v267/wewer25a.html},
  abstract = {We introduce Spatial Reasoning Models (SRMs), a framework to perform reasoning over sets of continuous variables via denoising generative models. SRMs infer continuous representations on a set of unobserved variables, given observations on observed variables. Current generative models on spatial domains, such as diffusion and flow matching models, often collapse to hallucination in case of complex distributions. To measure this, we introduce a set of benchmark tasks that test the quality of complex reasoning in generative models and can quantify hallucination. The SRM framework allows to report key findings about importance of sequentialization in generation, the associated order, as well as the sampling strategies during training. It demonstrates, for the first time, that order of generation can successfully be predicted by the denoising network itself. Using these findings, we can increase the accuracy of specific reasoning tasks from $<$1% to $>$50%. Our project website provides additional videos, code, and the benchmark datasets.}
}
Endnote
%0 Conference Paper
%T Spatial Reasoning with Denoising Models
%A Christopher Wewer
%A Bartlomiej Pogodzinski
%A Bernt Schiele
%A Jan Eric Lenssen
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wewer25a
%I PMLR
%P 66706--66725
%U https://proceedings.mlr.press/v267/wewer25a.html
%V 267
%X We introduce Spatial Reasoning Models (SRMs), a framework to perform reasoning over sets of continuous variables via denoising generative models. SRMs infer continuous representations on a set of unobserved variables, given observations on observed variables. Current generative models on spatial domains, such as diffusion and flow matching models, often collapse to hallucination in case of complex distributions. To measure this, we introduce a set of benchmark tasks that test the quality of complex reasoning in generative models and can quantify hallucination. The SRM framework allows to report key findings about importance of sequentialization in generation, the associated order, as well as the sampling strategies during training. It demonstrates, for the first time, that order of generation can successfully be predicted by the denoising network itself. Using these findings, we can increase the accuracy of specific reasoning tasks from $<$1% to $>$50%. Our project website provides additional videos, code, and the benchmark datasets.
APA
Wewer, C., Pogodzinski, B., Schiele, B. & Lenssen, J.E. (2025). Spatial Reasoning with Denoising Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:66706-66725. Available from https://proceedings.mlr.press/v267/wewer25a.html.
