ICRA 2026

PegasusFlow: Parallel Rolling-Denoising Score Sampling for Robot Diffusion Planner Flow Matching

Lei Ye, Haibo Gao, Peng Xu, Zhelin Zhang, Wei Zhang, Junqi Shan, Ao Zhang, Ruyi Zhou, Zongquan Deng, Liang Ding

AI summary

PegasusFlow enables expert-data-free training of robot diffusion planners by directly sampling trajectory score gradients in parallel, achieving faster convergence and higher success rates than baselines.

Diffusion Planning, Score Sampling, Trajectory Optimization, Parallel Simulation, Robot Locomotion, Model Predictive Control

Problem

Diffusion-based robot planners currently rely on costly and impractical imitation learning from expert demonstrations, while existing direct score estimation methods lack the computational efficiency and parallel scalability needed for real-world deployment.

Approach

The authors introduce a parallel rolling-denoising framework that uses Weighted Basis Function Optimization (WBFO) and an asynchronous parallel simulation architecture to directly estimate trajectory score gradients from environmental interactions, completely bypassing expert data.
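As a rough illustration of what estimating a score from sampled rollouts can look like, the sketch below uses a standard identity for Gaussian noising: the score of the noised density equals (E[x0 | x] - x) / sigma^2. The posterior mean is approximated by importance-weighting perturbed candidates with exp(-cost / temperature). This is a generic sketch, not the paper's estimator; `estimate_score`, `quadratic_cost`, the temperature, and the assumption that the target density is proportional to exp(-cost / temperature) are all hypothetical choices made here for illustration.

```python
import numpy as np

def estimate_score(x_noisy, cost_fn, sigma, rng, num_samples=20000, temperature=1.0):
    """Monte Carlo score estimate at x_noisy (hypothetical helper, not from the paper).

    Assumes the clean-trajectory density is proportional to exp(-cost / temperature)
    and that noising adds isotropic Gaussian noise with standard deviation sigma.
    """
    # Candidate clean trajectories drawn around the noisy one.
    candidates = x_noisy + sigma * rng.standard_normal((num_samples, x_noisy.size))
    # In a real planner each cost would come from a simulated rollout.
    costs = np.array([cost_fn(c) for c in candidates])
    # Self-normalized importance weights; subtracting the minimum avoids underflow.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # Weighted mean approximates the posterior mean E[x0 | x_noisy].
    denoised = weights @ candidates
    return (denoised - x_noisy) / sigma**2

def quadratic_cost(traj):
    # Toy cost whose induced density is a standard Gaussian (illustrative only).
    return 0.5 * float(traj @ traj)

rng = np.random.default_rng(0)
x_noisy = np.array([1.0, -0.5])
sigma = 0.5
score = estimate_score(x_noisy, quadratic_cost, sigma, rng)
# For this toy cost the noised density is N(0, (1 + sigma^2) I), so the score is known.
analytic = -x_noisy / (1.0 + sigma**2)
```

The toy cost is chosen only because its score has a closed form to compare against; with a simulator in the loop, the per-candidate cost evaluations are the part that would run in parallel.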

Key results

Parallel score sampling framework enabling pure score-matching training without expert demonstrations
Weighted Basis Function Optimization (WBFO) algorithm achieving faster convergence and superior sample efficiency over MPPI
Structured noise sampling scheme combining Latin Hypercube Sampling, hierarchical ramp scheduling, and RL warm-start
100% success rate in a challenging barrier-crossing task, 18% faster than the next-best method
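For readers unfamiliar with Latin Hypercube Sampling, named in the results above, here is a minimal NumPy sketch of the basic technique: each dimension is split into as many equal strata as there are samples, and every stratum receives exactly one point. How the paper combines this with ramp scheduling and the RL warm-start is not described in this summary, so the function name `latin_hypercube` and the sizes used are illustrative assumptions only.

```python
import numpy as np

def latin_hypercube(num_samples, dim, rng):
    """Return a (num_samples, dim) array of points in [0, 1) (illustrative helper)."""
    # Independent random ordering of the strata for each dimension.
    strata = np.stack(
        [rng.permutation(num_samples) for _ in range(dim)], axis=1
    )
    # Uniform jitter places each point somewhere inside its stratum.
    jitter = rng.random((num_samples, dim))
    return (strata + jitter) / num_samples

rng = np.random.default_rng(0)
samples = latin_hypercube(8, 3, rng)
# Stratum index occupied by each sample, per dimension.
occupied = np.floor(samples * 8).astype(int)
```

Compared with independent uniform draws, this guarantees even coverage of every one-dimensional margin, which is the usual reason for using it to spread a fixed budget of noise samples.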

Why it matters

Provides a scalable, data-efficient pathway for training diffusion-based robot planners, making complex terrain navigation and real-time control more accessible without costly expert datasets.

Abstract

Diffusion models offer powerful generative capabilities for robot trajectory planning, yet their practical deployment on robots is hindered by a critical bottleneck: reliance on imitation learning from expert demonstrations. This paradigm is problematic as it is often impractical to produce high quality data for specialized robots, and it creates an inefficient, theoretically suboptimal training pipeline. To overcome this, we introduce PegasusFlow, a parallel rolling-denoising framework that enables direct sampling of trajectory score gradients from environmental interaction, completely bypassing the need for expert data. Our core innovation is a sampling algorithm called Weighted Basis Function Optimization (WBFO), which leverages spline basis representations to achieve superior sample efficiency and faster convergence compared to traditional methods like MPPI. The framework is embedded within a scalable, asynchronous parallel simulation architecture that supports massively parallel rollouts for efficient data collection. Extensive experiments on trajectory optimization and robotic navigation tasks demonstrate that our approach, particularly Action-Value WBFO (AVWBFO) combined with a reinforcement learning warm-start, significantly outperforms baselines. In a challenging barrier-crossing task, our method achieved a 100% success rate and was 18% faster than the next-best method, validating its effectiveness for complex terrain locomotion planning. https://masteryip.github.io/pegasusflow.github.io/
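The abstract contrasts WBFO with MPPI and attributes its efficiency to spline basis representations, but does not spell out the algorithm. The sketch below shows only the general idea such a method could build on: an MPPI-style softmax-weighted update applied to a small vector of basis coefficients instead of to every time step. It is not the authors' WBFO. The piecewise-linear (degree-1 B-spline) basis, the names `hat_basis`, `weighted_basis_step`, and `tracking_cost`, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def hat_basis(num_knots, horizon):
    """Piecewise-linear (degree-1 B-spline) basis matrix of shape (horizon, num_knots)."""
    t = np.linspace(0.0, num_knots - 1, horizon)
    knots = np.arange(num_knots)
    return np.clip(1.0 - np.abs(t[:, None] - knots[None, :]), 0.0, None)

def weighted_basis_step(coeffs, cost_fn, basis, rng,
                        num_samples=256, sigma=0.3, temperature=0.1):
    """One MPPI-style update in coefficient space (illustrative, not the paper's WBFO)."""
    # Perturb the low-dimensional coefficients rather than every time step.
    candidates = coeffs + rng.normal(0.0, sigma, size=(num_samples, coeffs.size))
    # Expand each candidate to a full trajectory through the basis.
    trajectories = candidates @ basis.T
    # Each cost would be one simulated rollout in a real planner.
    costs = np.array([cost_fn(traj) for traj in trajectories])
    # Softmax weighting: lower cost gives higher weight.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    return weights @ candidates

def tracking_cost(traj):
    # Toy objective: follow a half sine wave (stand-in for a rollout cost).
    target = np.sin(np.linspace(0.0, np.pi, traj.size))
    return float(np.mean((traj - target) ** 2))

rng = np.random.default_rng(0)
basis = hat_basis(num_knots=8, horizon=50)
coeffs = np.zeros(8)
initial_cost = tracking_cost(basis @ coeffs)
for _ in range(30):
    coeffs = weighted_basis_step(coeffs, tracking_cost, basis, rng)
final_cost = tracking_cost(basis @ coeffs)
```

Searching over 8 coefficients instead of 50 independent time steps shrinks the sampling space and keeps every candidate trajectory smooth, which is one plausible reason a basis representation can improve sample efficiency over per-step perturbation.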

Index terms

Integrated Planning and Control, Machine Learning for Robot Control, Simulation and Animation