Single-Shot Scene Reconstruction

Sergey Zakharov, Rares Andrei Ambrus, Vitor Campagnolo Guizilini, Dennis Park, Wadim Kehl, Fredo Durand, Joshua B. Tenenbaum, Vincent Sitzmann, Jiajun Wu, Adrien Gaidon
Proceedings of the 5th Conference on Robot Learning, PMLR 164:501-512, 2022.

Abstract

We introduce a novel scene reconstruction method to infer a fully editable and re-renderable model of a 3D road scene from a single image. We represent movable objects separately from the immovable background, and recover a full 3D model of each distinct object as well as their spatial relations in the scene. We leverage transformer-based detectors and neural implicit 3D representations and we build a Scene Decomposition Network (SDN) that reconstructs the scene in 3D. Furthermore, we show that this reconstruction can be used in an analysis-by-synthesis setting via differentiable rendering. Trained only on simulated road scenes, our method generalizes well to real data in the same class without any adaptation thanks to its strong inductive priors. Experiments on two synthetic-real dataset pairs (PD-DDAD and VKITTI-KITTI) show that our method can robustly recover scene geometry and appearance, as well as reconstruct and re-render the scene from novel viewpoints.
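To make the analysis-by-synthesis step mentioned above more concrete, below is a minimal, hypothetical sketch (it is not the paper's actual SDN or rendering pipeline): per-object latent codes and poses are refined by gradient descent so that a differentiable re-rendering of the reconstructed scene matches the observed image. The ToyDifferentiableRenderer class, the refine function, and the photometric L1 loss are illustrative stand-ins assumed for this sketch.

# Hypothetical analysis-by-synthesis refinement via differentiable rendering.
# The renderer below is a stand-in (a small MLP conditioned on a latent code
# and a pose), not the paper's neural implicit representation or renderer.
import torch
import torch.nn as nn

class ToyDifferentiableRenderer(nn.Module):
    """Placeholder for a differentiable renderer of a reconstructed object."""
    def __init__(self, latent_dim=64, image_size=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, image_size * image_size * 3),
        )
        self.image_size = image_size

    def forward(self, latent, pose):
        # latent: (B, latent_dim); pose: (B, 3) translation -> image (B, 3, H, W)
        x = torch.cat([latent, pose], dim=-1)
        img = self.net(x).view(-1, 3, self.image_size, self.image_size)
        return torch.sigmoid(img)

def refine(renderer, observed, latent_init, pose_init, steps=200, lr=1e-2):
    """Optimize latent code and pose so the re-rendered image matches the input."""
    latent = latent_init.clone().requires_grad_(True)
    pose = pose_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([latent, pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rendered = renderer(latent, pose)
        # Photometric loss between the re-rendered scene and the observed image.
        loss = torch.nn.functional.l1_loss(rendered, observed)
        loss.backward()
        opt.step()
    return latent.detach(), pose.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    renderer = ToyDifferentiableRenderer()
    observed = torch.rand(1, 3, 32, 32)   # stand-in for the input image
    latent0 = torch.randn(1, 64)          # initial object latent code
    pose0 = torch.zeros(1, 3)             # initial object translation
    latent, pose = refine(renderer, observed, latent0, pose0)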

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-zakharov22a,
  title     = {Single-Shot Scene Reconstruction},
  author    = {Zakharov, Sergey and Ambrus, Rares Andrei and Guizilini, Vitor Campagnolo and Park, Dennis and Kehl, Wadim and Durand, Fredo and Tenenbaum, Joshua B. and Sitzmann, Vincent and Wu, Jiajun and Gaidon, Adrien},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages     = {501--512},
  year      = {2022},
  editor    = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume    = {164},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--11 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v164/zakharov22a/zakharov22a.pdf},
  url       = {https://proceedings.mlr.press/v164/zakharov22a.html},
  abstract  = {We introduce a novel scene reconstruction method to infer a fully editable and re-renderable model of a 3D road scene from a single image. We represent movable objects separately from the immovable background, and recover a full 3D model of each distinct object as well as their spatial relations in the scene. We leverage transformer-based detectors and neural implicit 3D representations and we build a Scene Decomposition Network (SDN) that reconstructs the scene in 3D. Furthermore, we show that this reconstruction can be used in an analysis-by-synthesis setting via differentiable rendering. Trained only on simulated road scenes, our method generalizes well to real data in the same class without any adaptation thanks to its strong inductive priors. Experiments on two synthetic-real dataset pairs (PD-DDAD and VKITTI-KITTI) show that our method can robustly recover scene geometry and appearance, as well as reconstruct and re-render the scene from novel viewpoints.}
}
Endnote
%0 Conference Paper
%T Single-Shot Scene Reconstruction
%A Sergey Zakharov
%A Rares Andrei Ambrus
%A Vitor Campagnolo Guizilini
%A Dennis Park
%A Wadim Kehl
%A Fredo Durand
%A Joshua B. Tenenbaum
%A Vincent Sitzmann
%A Jiajun Wu
%A Adrien Gaidon
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-zakharov22a
%I PMLR
%P 501--512
%U https://proceedings.mlr.press/v164/zakharov22a.html
%V 164
%X We introduce a novel scene reconstruction method to infer a fully editable and re-renderable model of a 3D road scene from a single image. We represent movable objects separately from the immovable background, and recover a full 3D model of each distinct object as well as their spatial relations in the scene. We leverage transformer-based detectors and neural implicit 3D representations and we build a Scene Decomposition Network (SDN) that reconstructs the scene in 3D. Furthermore, we show that this reconstruction can be used in an analysis-by-synthesis setting via differentiable rendering. Trained only on simulated road scenes, our method generalizes well to real data in the same class without any adaptation thanks to its strong inductive priors. Experiments on two synthetic-real dataset pairs (PD-DDAD and VKITTI-KITTI) show that our method can robustly recover scene geometry and appearance, as well as reconstruct and re-render the scene from novel viewpoints.
APA
Zakharov, S., Ambrus, R.A., Guizilini, V.C., Park, D., Kehl, W., Durand, F., Tenenbaum, J.B., Sitzmann, V., Wu, J. & Gaidon, A. (2022). Single-Shot Scene Reconstruction. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:501-512. Available from https://proceedings.mlr.press/v164/zakharov22a.html.