Diffusion Trajectory-Guided Policy for Long-Horizon Robot Manipulation
Shichao Fan, Quantao Yang, Yajie Liu, Kun Wu, Zhengping Che, Qingjie Liu, Min Wan
AI summary
Problem
Imitation learning for long-horizon robotic tasks struggles with compounding errors and scarce demonstration data, leading to cascading failures and poor generalization.
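To make the compounding-error problem concrete, here is a back-of-the-envelope sketch. It assumes every step of a task succeeds independently with the same probability, a simplification that real manipulation tasks need not satisfy. The function name and the 0.95 figure are illustrative and are not taken from the paper.

```python
def chain_success_probability(per_step_success: float, num_steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes independent steps with identical success probability
    (an illustrative simplification, not a claim from the paper).
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must lie in [0, 1]")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return per_step_success ** num_steps


# Even a policy that is right 95% of the time per step degrades quickly
# as the horizon grows.
for steps in (1, 5, 20):
    print(steps, round(chain_success_probability(0.95, steps), 3))
```

Under this assumption, small per-step error rates multiply across the horizon, which is why long task chains are much harder than single tasks.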
Approach
The framework uses a two-stage process: first, a vision-language diffusion model generates task-relevant 2D trajectories, which then guide the training of a robot manipulation policy to reduce error accumulation.
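The two-stage idea can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation. The noise schedule, the number of waypoints, and every function name are hypothetical. The trained vision-language denoiser is replaced by an oracle that always predicts the noise relative to a straight line between two image points, and the policy is a random linear map that only shows how a trajectory could be passed in as an extra input.

```python
import math
import random

NUM_DIFFUSION_STEPS = 50  # assumed value, not from the paper
NUM_WAYPOINTS = 8         # assumed number of 2D points per trajectory

# Linear DDPM-style noise schedule (an assumption for this sketch).
BETAS = [1e-4 + (0.02 - 1e-4) * i / (NUM_DIFFUSION_STEPS - 1)
         for i in range(NUM_DIFFUSION_STEPS)]
ALPHAS = [1.0 - beta for beta in BETAS]
ALPHA_BARS = []
running_product = 1.0
for alpha in ALPHAS:
    running_product *= alpha
    ALPHA_BARS.append(running_product)


def straight_line(start_xy, goal_xy, num_points=NUM_WAYPOINTS):
    """Evenly spaced 2D points from start_xy to goal_xy."""
    points = []
    for i in range(num_points):
        w = i / (num_points - 1)
        points.append(tuple((1 - w) * s + w * g for s, g in zip(start_xy, goal_xy)))
    return points


def toy_noise_predictor(noisy_traj, step, start_xy, goal_xy):
    """Hypothetical stand-in for a trained denoiser: it treats the straight
    line as the clean trajectory and returns the implied noise."""
    clean = straight_line(start_xy, goal_xy, len(noisy_traj))
    scale = math.sqrt(ALPHA_BARS[step])
    denom = math.sqrt(1.0 - ALPHA_BARS[step])
    return [tuple((n - scale * c) / denom for n, c in zip(noisy_pt, clean_pt))
            for noisy_pt, clean_pt in zip(noisy_traj, clean)]


def sample_trajectory(start_xy, goal_xy, rng):
    """Stage 1 (toy): DDPM-style reverse process over 2D waypoints."""
    traj = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(NUM_WAYPOINTS)]
    for step in reversed(range(NUM_DIFFUSION_STEPS)):
        eps_hat = toy_noise_predictor(traj, step, start_xy, goal_xy)
        coef = BETAS[step] / math.sqrt(1.0 - ALPHA_BARS[step])
        sigma = math.sqrt(BETAS[step]) if step > 0 else 0.0  # no noise at the last step
        traj = [tuple((x - coef * e) / math.sqrt(ALPHAS[step]) + sigma * rng.gauss(0, 1)
                      for x, e in zip(pt, eps_pt))
                for pt, eps_pt in zip(traj, eps_hat)]
    return traj


def toy_policy(observation, trajectory, weights):
    """Stage 2 (toy): a linear policy that sees the observation features
    concatenated with the flattened trajectory."""
    features = list(observation) + [coord for pt in trajectory for coord in pt]
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


rng = random.Random(0)
trajectory = sample_trajectory((0.1, 0.2), (0.8, 0.6), rng)
observation = [rng.gauss(0, 1) for _ in range(16)]  # placeholder features
weights = [[0.01 * rng.gauss(0, 1) for _ in range(16 + 2 * NUM_WAYPOINTS)]
           for _ in range(7)]  # 7 action dimensions is an assumption
action = toy_policy(observation, trajectory, weights)
print(len(trajectory), len(action))
```

Because the toy predictor is an oracle, the sampled trajectory collapses onto the straight line. In a real system the denoiser would be learned from demonstrations and conditioned on images and language, and the policy would be trained with the generated trajectory as guidance.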
Key results
- 25% higher average success rate on the CALVIN benchmark
- Trained from scratch without external pretraining
- Significant real-world robot performance improvements
- Computationally efficient training on consumer-grade GPUs
Why it matters
Enables reliable, data-efficient long-horizon robotic manipulation for real-world applications by bridging high-level language instructions with precise motor control.
Abstract
Recently, Vision-Language-Action (VLA) models have advanced robot imitation learning, but high data collection costs and limited demonstrations hinder generalization, and current imitation learning methods struggle in out-of-distribution scenarios, especially for long-horizon tasks. A key challenge is how to mitigate compounding errors in imitation learning, which lead to cascading failures over extended trajectories. To address these challenges, we propose the Diffusion Trajectory-guided Policy (DTP) framework, which generates 2D trajectories through a diffusion model to guide policy learning for long-horizon tasks. By leveraging task-relevant trajectories, DTP provides trajectory-level guidance to reduce error accumulation. Our two-stage approach first trains a generative vision-language model to create diffusion-based trajectories, then refines the imitation policy using them. Experiments on the CALVIN benchmark show that DTP outperforms state-of-the-art baselines by 25% in success rate, starting from scratch without external pretraining. Moreover, DTP significantly improves real-world robot performance. Our project is at diffusion-trajectory-guided-policy.github.io/.