ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation

Zhou Xian, Nikolaos Gkanatsios, Theophile Gervet, Tsung-Wei Ke, Katerina Fragkiadaki
Proceedings of The 7th Conference on Robot Learning, PMLR 229:2323-2339, 2023.

Abstract

We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and trajectory diffusion generation for learning robot manipulation from demonstrations. Our main innovation is to use a global transformer-based action predictor to predict actions at keyframes, a task that requires multi-modal semantic scene understanding, and to use a local trajectory diffuser to predict trajectory segments that connect predicted macro-actions. ChainedDiffuser sets a new record on established manipulation benchmarks, and outperforms both state-of-the-art keypose (macro-action) prediction models that use motion planners for trajectory prediction, and trajectory diffusion policies that do not predict keyframe macro-actions. We conduct experiments in both simulated and real-world environments and demonstrate ChainedDiffuser’s ability to solve a wide range of manipulation tasks involving interactions with diverse objects.

Cite this Paper


BibTeX
@InProceedings{pmlr-v229-xian23a,
  title     = {ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation},
  author    = {Xian, Zhou and Gkanatsios, Nikolaos and Gervet, Theophile and Ke, Tsung-Wei and Fragkiadaki, Katerina},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {2323--2339},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/xian23a/xian23a.pdf},
  url       = {https://proceedings.mlr.press/v229/xian23a.html},
  abstract  = {We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and trajectory diffusion generation for learning robot manipulation from demonstrations. Our main innovation is to use a global transformer-based action predictor to predict actions at keyframes, a task that requires multi-modal semantic scene understanding, and to use a local trajectory diffuser to predict trajectory segments that connect predicted macro-actions. ChainedDiffuser sets a new record on established manipulation benchmarks, and outperforms both state-of-the-art keypose (macro-action) prediction models that use motion planners for trajectory prediction, and trajectory diffusion policies that do not predict keyframe macro-actions. We conduct experiments in both simulated and real-world environments and demonstrate ChainedDiffuser’s ability to solve a wide range of manipulation tasks involving interactions with diverse objects.}
}
Endnote
%0 Conference Paper
%T ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation
%A Zhou Xian
%A Nikolaos Gkanatsios
%A Theophile Gervet
%A Tsung-Wei Ke
%A Katerina Fragkiadaki
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-xian23a
%I PMLR
%P 2323--2339
%U https://proceedings.mlr.press/v229/xian23a.html
%V 229
%X We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and trajectory diffusion generation for learning robot manipulation from demonstrations. Our main innovation is to use a global transformer-based action predictor to predict actions at keyframes, a task that requires multi-modal semantic scene understanding, and to use a local trajectory diffuser to predict trajectory segments that connect predicted macro-actions. ChainedDiffuser sets a new record on established manipulation benchmarks, and outperforms both state-of-the-art keypose (macro-action) prediction models that use motion planners for trajectory prediction, and trajectory diffusion policies that do not predict keyframe macro-actions. We conduct experiments in both simulated and real-world environments and demonstrate ChainedDiffuser’s ability to solve a wide range of manipulation tasks involving interactions with diverse objects.
APA
Xian, Z., Gkanatsios, N., Gervet, T., Ke, T.-W., & Fragkiadaki, K. (2023). ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:2323-2339. Available from https://proceedings.mlr.press/v229/xian23a.html
