KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection

Andrea Rosasco, Federico Ceola, Giulia Pasquale, Lorenzo Natale
Proceedings of The 9th Conference on Robot Learning, PMLR 305:1210-1224, 2025.

Abstract

Learning robot policies that capture the multimodality of the training data has been a long-standing open challenge for behavior cloning. Recent approaches tackle the problem by modeling the conditional action distribution with generative models. One such approach is Diffusion Policy, which relies on a diffusion model to denoise random points into robot action trajectories. While achieving state-of-the-art performance, it has two main drawbacks that may drive the robot out of the data distribution during policy execution. First, the stochasticity of the denoising process can strongly affect the quality of the generated action trajectories. Second, being a supervised learning approach, it can learn outliers present in the training dataset. Recent work mitigates these limitations by combining Diffusion Policy either with large-scale training or with classical behavior cloning algorithms. Instead, we propose KDPE, a Kernel Density Estimation-based strategy that filters out potentially harmful trajectories generated by Diffusion Policy while keeping a low test-time computational overhead. For Kernel Density Estimation, we propose a manifold-aware kernel to model a probability density function over actions composed of end-effector Cartesian position, orientation, and gripper state. Overall, KDPE achieves better performance than Diffusion Policy on simulated single-arm RoboMimic and MimicGen tasks, and on three real robot experiments: PickPlush, a tabletop grasping task; CubeSort, a multimodal pick-and-place task; and CoffeeMaking, a task that requires long-horizon capabilities and precise execution. The code will be released upon acceptance, and additional material is available on our anonymized project page: https://kdpe-robotics.github.io.
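To make the selection step concrete, below is a minimal Python sketch of a KDE-based trajectory filter in the spirit of KDPE. It is not the authors' implementation: the specific kernel forms, the bandwidth values, and the leave-one-out scoring are illustrative assumptions. Only the general idea comes from the abstract, namely fitting a density over a batch of sampled trajectories with a manifold-aware product kernel over position, orientation, and gripper state, and executing the highest-density one.

```python
import numpy as np

# Hypothetical kernel parameters; placeholders, not the paper's tuned values.
POS_BANDWIDTH = 0.05      # meters, Gaussian kernel on Cartesian position
ORI_CONCENTRATION = 50.0  # concentration of a von Mises-Fisher-style quaternion kernel
GRIP_BANDWIDTH = 0.1      # Gaussian kernel on the scalar gripper command

def position_kernel(p, q):
    # Standard Gaussian kernel on positions in R^3.
    return np.exp(-np.sum((p - q) ** 2, axis=-1) / (2 * POS_BANDWIDTH ** 2))

def orientation_kernel(q1, q2):
    # Kernel on unit quaternions; abs() handles the double cover of SO(3)
    # (q and -q encode the same rotation), which makes it manifold-aware.
    return np.exp(ORI_CONCENTRATION * (np.abs(np.sum(q1 * q2, axis=-1)) - 1.0))

def gripper_kernel(g1, g2):
    # Gaussian kernel on the scalar gripper state.
    return np.exp(-((g1 - g2) ** 2) / (2 * GRIP_BANDWIDTH ** 2))

def kdpe_select(pos, quat, grip):
    """Pick the most 'typical' of N trajectories sampled from the policy.

    pos:  (N, T, 3) end-effector positions
    quat: (N, T, 4) unit quaternions
    grip: (N, T)    gripper commands
    """
    n = len(pos)
    scores = np.empty(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Product kernel over the action components, averaged over the
        # other samples and over timesteps (a leave-one-out KDE score).
        k = (position_kernel(pos[i][None], pos[others])
             * orientation_kernel(quat[i][None], quat[others])
             * gripper_kernel(grip[i][None], grip[others]))
        scores[i] = k.mean()
    return int(np.argmax(scores))  # index of the highest-density trajectory
```

Under these assumptions, outlier trajectories receive low density because they are far from the bulk of the sampled batch, so selecting the argmax acts as the filtering step the abstract describes.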

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-rosasco25a,
  title     = {KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection},
  author    = {Rosasco, Andrea and Ceola, Federico and Pasquale, Giulia and Natale, Lorenzo},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {1210--1224},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/rosasco25a/rosasco25a.pdf},
  url       = {https://proceedings.mlr.press/v305/rosasco25a.html},
  abstract  = {Learning robot policies that capture the multimodality of the training data has been a long-standing open challenge for behavior cloning. Recent approaches tackle the problem by modeling the conditional action distribution with generative models. One such approach is Diffusion Policy, which relies on a diffusion model to denoise random points into robot action trajectories. While achieving state-of-the-art performance, it has two main drawbacks that may drive the robot out of the data distribution during policy execution. First, the stochasticity of the denoising process can strongly affect the quality of the generated action trajectories. Second, being a supervised learning approach, it can learn outliers present in the training dataset. Recent work mitigates these limitations by combining Diffusion Policy either with large-scale training or with classical behavior cloning algorithms. Instead, we propose KDPE, a Kernel Density Estimation-based strategy that filters out potentially harmful trajectories generated by Diffusion Policy while keeping a low test-time computational overhead. For Kernel Density Estimation, we propose a manifold-aware kernel to model a probability density function over actions composed of end-effector Cartesian position, orientation, and gripper state. Overall, KDPE achieves better performance than Diffusion Policy on simulated single-arm RoboMimic and MimicGen tasks, and on three real robot experiments: PickPlush, a tabletop grasping task; CubeSort, a multimodal pick-and-place task; and CoffeeMaking, a task that requires long-horizon capabilities and precise execution. The code will be released upon acceptance, and additional material is available on our anonymized project page: https://kdpe-robotics.github.io.}
}
Endnote
%0 Conference Paper
%T KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection
%A Andrea Rosasco
%A Federico Ceola
%A Giulia Pasquale
%A Lorenzo Natale
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-rosasco25a
%I PMLR
%P 1210--1224
%U https://proceedings.mlr.press/v305/rosasco25a.html
%V 305
%X Learning robot policies that capture the multimodality of the training data has been a long-standing open challenge for behavior cloning. Recent approaches tackle the problem by modeling the conditional action distribution with generative models. One such approach is Diffusion Policy, which relies on a diffusion model to denoise random points into robot action trajectories. While achieving state-of-the-art performance, it has two main drawbacks that may drive the robot out of the data distribution during policy execution. First, the stochasticity of the denoising process can strongly affect the quality of the generated action trajectories. Second, being a supervised learning approach, it can learn outliers present in the training dataset. Recent work mitigates these limitations by combining Diffusion Policy either with large-scale training or with classical behavior cloning algorithms. Instead, we propose KDPE, a Kernel Density Estimation-based strategy that filters out potentially harmful trajectories generated by Diffusion Policy while keeping a low test-time computational overhead. For Kernel Density Estimation, we propose a manifold-aware kernel to model a probability density function over actions composed of end-effector Cartesian position, orientation, and gripper state. Overall, KDPE achieves better performance than Diffusion Policy on simulated single-arm RoboMimic and MimicGen tasks, and on three real robot experiments: PickPlush, a tabletop grasping task; CubeSort, a multimodal pick-and-place task; and CoffeeMaking, a task that requires long-horizon capabilities and precise execution. The code will be released upon acceptance, and additional material is available on our anonymized project page: https://kdpe-robotics.github.io.
APA
Rosasco, A., Ceola, F., Pasquale, G., & Natale, L. (2025). KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:1210-1224. Available from https://proceedings.mlr.press/v305/rosasco25a.html.
