Modeling Dynamic Environments with Scene Graph Memory

Andrey Kurenkov, Michael Lingelbach, Tanmay Agarwal, Emily Jin, Chengshu Li, Ruohan Zhang, Li Fei-Fei, Jiajun Wu, Silvio Savarese, Roberto Martín-Martín
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:17976-17993, 2023.

Abstract

Embodied AI agents that search for objects in large environments such as households often need to make efficient decisions by predicting object locations based on partial information. We pose this as a new type of link prediction problem: link prediction on partially observable dynamic graphs. Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges; only parts of the changing graph are known to the agent at each timestep. This partial observability poses a challenge to existing link prediction approaches, which we address. We propose a novel state representation – Scene Graph Memory (SGM) – which captures the agent’s accumulated set of observations, as well as a neural net architecture called a Node Edge Predictor (NEP) that extracts information from the SGM to search efficiently. We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen in homes, and show that NEP can be trained to predict the locations of objects in a variety of environments with diverse object movement dynamics, outperforming baselines both in terms of new scene adaptability and overall accuracy. The codebase and more can be found at www.scenegraphmemory.com.
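
As a rough illustration of the idea of accumulating partial observations into a graph-structured memory, the following Python sketch stores observed object-location edges and ranks candidate locations for a queried object. It is not the paper's implementation: the class name, method names, and the count-then-recency ranking are all assumptions made for this example, and the ranking is a simple frequency heuristic standing in for the learned Node Edge Predictor.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraphMemorySketch:
    """Hypothetical memory of observed (object, location) edges.

    Each edge maps to (times_observed, last_timestep_seen). This is an
    illustrative stand-in, not the SGM data structure from the paper.
    """
    edges: dict = field(default_factory=dict)

    def observe(self, timestep, location, objects_seen):
        # The agent only sees one location per step (partial observability),
        # so only edges touching that location are updated.
        for obj in objects_seen:
            count, _ = self.edges.get((obj, location), (0, -1))
            self.edges[(obj, location)] = (count + 1, timestep)

    def candidates(self, obj):
        # Rank remembered locations by observation count, then by recency.
        # A learned predictor such as NEP would replace this heuristic.
        scored = [
            (count, last_seen, loc)
            for (o, loc), (count, last_seen) in self.edges.items()
            if o == obj
        ]
        return [loc for _, _, loc in sorted(scored, reverse=True)]


memory = SceneGraphMemorySketch()
memory.observe(0, "kitchen_counter", ["mug"])
memory.observe(1, "desk", ["mug"])
memory.observe(2, "kitchen_counter", ["mug", "plate"])
ranked = memory.candidates("mug")
print(ranked)
```

Here the mug was seen twice on the kitchen counter and once on the desk, so the counter is ranked first. An object that was never observed yields an empty list, which is the case where a learned model that uses semantic priors would be needed.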

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-kurenkov23a,
  title = {Modeling Dynamic Environments with Scene Graph Memory},
  author = {Kurenkov, Andrey and Lingelbach, Michael and Agarwal, Tanmay and Jin, Emily and Li, Chengshu and Zhang, Ruohan and Fei-Fei, Li and Wu, Jiajun and Savarese, Silvio and Mart\'{\i}n-Mart\'{\i}n, Roberto},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {17976--17993},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/kurenkov23a/kurenkov23a.pdf},
  url = {https://proceedings.mlr.press/v202/kurenkov23a.html},
  abstract = {Embodied AI agents that search for objects in large environments such as households often need to make efficient decisions by predicting object locations based on partial information. We pose this as a new type of link prediction problem: link prediction on partially observable dynamic graphs. Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges; only parts of the changing graph are known to the agent at each timestep. This partial observability poses a challenge to existing link prediction approaches, which we address. We propose a novel state representation – Scene Graph Memory (SGM) – which captures the agent’s accumulated set of observations, as well as a neural net architecture called a Node Edge Predictor (NEP) that extracts information from the SGM to search efficiently. We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen in homes, and show that NEP can be trained to predict the locations of objects in a variety of environments with diverse object movement dynamics, outperforming baselines both in terms of new scene adaptability and overall accuracy. The codebase and more can be found at www.scenegraphmemory.com.}
}
Endnote
%0 Conference Paper
%T Modeling Dynamic Environments with Scene Graph Memory
%A Andrey Kurenkov
%A Michael Lingelbach
%A Tanmay Agarwal
%A Emily Jin
%A Chengshu Li
%A Ruohan Zhang
%A Li Fei-Fei
%A Jiajun Wu
%A Silvio Savarese
%A Roberto Martín-Martín
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-kurenkov23a
%I PMLR
%P 17976--17993
%U https://proceedings.mlr.press/v202/kurenkov23a.html
%V 202
%X Embodied AI agents that search for objects in large environments such as households often need to make efficient decisions by predicting object locations based on partial information. We pose this as a new type of link prediction problem: link prediction on partially observable dynamic graphs. Our graph is a representation of a scene in which rooms and objects are nodes, and their relationships are encoded in the edges; only parts of the changing graph are known to the agent at each timestep. This partial observability poses a challenge to existing link prediction approaches, which we address. We propose a novel state representation – Scene Graph Memory (SGM) – which captures the agent’s accumulated set of observations, as well as a neural net architecture called a Node Edge Predictor (NEP) that extracts information from the SGM to search efficiently. We evaluate our method in the Dynamic House Simulator, a new benchmark that creates diverse dynamic graphs following the semantic patterns typically seen in homes, and show that NEP can be trained to predict the locations of objects in a variety of environments with diverse object movement dynamics, outperforming baselines both in terms of new scene adaptability and overall accuracy. The codebase and more can be found at www.scenegraphmemory.com.
APA
Kurenkov, A., Lingelbach, M., Agarwal, T., Jin, E., Li, C., Zhang, R., Fei-Fei, L., Wu, J., Savarese, S. & Martín-Martín, R. (2023). Modeling Dynamic Environments with Scene Graph Memory. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:17976-17993. Available from https://proceedings.mlr.press/v202/kurenkov23a.html.
