Modeling the Real World with High-Density Visual Particle Dynamics

William F Whitney, Jake Varley, Deepali Jain, Krzysztof Marcin Choromanski, Sumeet Singh, Vikas Sindhwani
Proceedings of The 8th Conference on Robot Learning, PMLR 270:1427-1442, 2025.

Abstract

We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neighbour attention layers. We demonstrate the capabilities of HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with two RGB-D cameras. Compared to the previous graph neural network approach, our Interlacer dynamics is twice as fast with the same prediction quality, and can achieve higher quality using 4x as many particles. We illustrate how HD-VPD can evaluate motion plan quality with robotic box pushing and can grasping tasks. See videos and particle dynamics rendered by HD-VPD at https://sites.google.com/view/hd-vpd.

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-whitney25a,
  title = {Modeling the Real World with High-Density Visual Particle Dynamics},
  author = {Whitney, William F and Varley, Jake and Jain, Deepali and Choromanski, Krzysztof Marcin and Singh, Sumeet and Sindhwani, Vikas},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages = {1427--1442},
  year = {2025},
  editor = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume = {270},
  series = {Proceedings of Machine Learning Research},
  month = {06--09 Nov},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/whitney25a/whitney25a.pdf},
  url = {https://proceedings.mlr.press/v270/whitney25a.html},
  abstract = {We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neighbour attention layers. We demonstrate the capabilities of HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with two RGB-D cameras. Compared to the previous graph neural network approach, our Interlacer dynamics is twice as fast with the same prediction quality, and can achieve higher quality using 4x as many particles. We illustrate how HD-VPD can evaluate motion plan quality with robotic box pushing and can grasping tasks. See videos and particle dynamics rendered by HD-VPD at https://sites.google.com/view/hd-vpd.}
}
Endnote
%0 Conference Paper
%T Modeling the Real World with High-Density Visual Particle Dynamics
%A William F Whitney
%A Jake Varley
%A Deepali Jain
%A Krzysztof Marcin Choromanski
%A Sumeet Singh
%A Vikas Sindhwani
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-whitney25a
%I PMLR
%P 1427--1442
%U https://proceedings.mlr.press/v270/whitney25a.html
%V 270
%X We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neighbour attention layers. We demonstrate the capabilities of HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with two RGB-D cameras. Compared to the previous graph neural network approach, our Interlacer dynamics is twice as fast with the same prediction quality, and can achieve higher quality using 4x as many particles. We illustrate how HD-VPD can evaluate motion plan quality with robotic box pushing and can grasping tasks. See videos and particle dynamics rendered by HD-VPD at https://sites.google.com/view/hd-vpd.
APA
Whitney, W.F., Varley, J., Jain, D., Choromanski, K.M., Singh, S. & Sindhwani, V. (2025). Modeling the Real World with High-Density Visual Particle Dynamics. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:1427-1442. Available from https://proceedings.mlr.press/v270/whitney25a.html.
