ICRA 2026

ConsistencyPlanner: Real-Time Planning with Fast-Sampling Consistency Models

Qichao Zhang, Xing Fang, Jiaqi Fang, Zhenwen Cai, Jie Ling, Qiankun Yu, Dongbin Zhao

AI summary

Fast-sampling consistency models enable real-time, multimodal closed-loop driving planning with superior safety metrics compared to diffusion-based baselines.

Autonomous driving, Consistency models, Real-time planning, Closed-loop control, Generative planning, Waymax benchmark

Problem

Learning-based autonomous driving planners struggle to balance modeling diverse, multimodal driving behaviors with the low-latency requirements of real-time deployment, often producing plans that are unsafe or too expensive to compute in time.

Approach

ConsistencyPlanner integrates fast-sampling consistency models with an attention-enhanced decoder to fuse scene and route features, enabling efficient single-step generation of diverse driving trajectories without iterative denoising.
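
To illustrate what "single-step generation without iterative denoising" means, here is a minimal sketch of one-step consistency sampling in NumPy. It uses the standard consistency-model parameterization f(x, t) = c_skip(t) x + c_out(t) F(x, t), which returns its input unchanged at the smallest noise level. The constants, the trajectory shape, and toy_network (a stand-in for a trained network) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

SIGMA_DATA = 0.5  # assumed data standard deviation
EPS = 0.002       # assumed smallest noise level
T_MAX = 80.0      # assumed largest noise level


def c_skip(t):
    # Weight on the noisy input; equals 1 at t = EPS.
    return SIGMA_DATA**2 / ((t - EPS) ** 2 + SIGMA_DATA**2)


def c_out(t):
    # Weight on the network output; equals 0 at t = EPS.
    return SIGMA_DATA * (t - EPS) / np.sqrt(SIGMA_DATA**2 + t**2)


def toy_network(x, t, cond):
    # Hypothetical stand-in for a learned network F(x, t | cond).
    # A real planner would use a trained model conditioned on the scene.
    return cond + 0.1 * np.tanh(x)


def consistency_fn(x, t, cond):
    # Maps a noisy sample at noise level t directly to a clean sample.
    return c_skip(t) * x + c_out(t) * toy_network(x, t, cond)


def sample_trajectories(cond, num_modes, rng):
    # Draw num_modes noise samples and denoise all of them in ONE call,
    # instead of looping over many denoising steps as diffusion does.
    horizon, dim = cond.shape
    x_T = T_MAX * rng.standard_normal((num_modes, horizon, dim))
    return consistency_fn(x_T, T_MAX, cond)


rng = np.random.default_rng(0)
cond = np.zeros((8, 2))  # hypothetical conditioning: 8 waypoints in (x, y)
trajs = sample_trajectories(cond, num_modes=6, rng=rng)
print(trajs.shape)  # (6, 8, 2)
```

Each of the six candidate trajectories costs a single network evaluation, and that is where the latency advantage over multi-step denoising comes from.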

Key results

Achieves the lowest collision rate (2.77%) and off-road rate (2.09%) among the compared methods on the Waymax benchmark
Delivers real-time inference at ~15 ms latency, well below that of diffusion models that rely on iterative denoising
Surpasses state-of-the-art baselines in closed-loop safety metrics
Validates that attention-based feature fusion significantly improves planning robustness

Why it matters

Offers a practical, low-latency planning solution for safety-critical autonomous vehicles that must navigate complex, dynamic traffic environments in real time.

Abstract

Closed-loop planning in complex, real-world driving scenarios presents a critical challenge for autonomous driving systems. While traditional rule-based methods are interpretable, their predefined heuristics lack the adaptability needed for dynamic traffic environments. Learning-based approaches, despite their promise, struggle to balance modeling diverse, multimodal driving behaviors against real-time planning, often leading to indecisive or unsafe actions. To address this limitation, we propose ConsistencyPlanner, a real-time planning framework with fast-sampling consistency models. Our approach is built upon two key technical contributions. (1) Efficient Multimodal Sampling: We employ fast-sampling consistency models to generate a diverse set of plausible future trajectories. This enables efficient, real-time exploration of multimodal actions, overcoming the computational bottlenecks of previous iterative generative methods. (2) Heterogeneous Feature Fusion: We introduce an attention-enhanced decoder that dynamically integrates heterogeneous input features, including scene features and action tokens, into a cohesive representation for robust planning. Extensive evaluation in the Waymax simulator demonstrates superior performance in safety metrics compared to existing methods, with particularly strong results in challenging dynamic scenarios.
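
As a rough illustration of the attention-based feature fusion the abstract describes, the sketch below applies single-head scaled dot-product cross-attention in NumPy, with action tokens attending over scene features. The token counts, feature width, random projection matrices, and residual connection are all assumptions for illustration; the paper's actual decoder design is not specified here.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    # Single-head scaled dot-product attention: each query token
    # gathers a weighted mix of the context tokens.
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
d = 16                                        # assumed feature width
scene_tokens = rng.standard_normal((10, d))   # hypothetical: 10 scene elements
action_tokens = rng.standard_normal((4, d))   # hypothetical: 4 action tokens

# Random projections stand in for learned weights.
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

attended, weights = cross_attention(action_tokens, scene_tokens, w_q, w_k, w_v)
fused_tokens = action_tokens + attended       # residual connection (assumed)
print(fused_tokens.shape, weights.shape)      # (4, 16) (4, 10)
```

Each row of weights sums to 1, so every action token receives a convex combination of the projected scene features. This is one common way to merge inputs of different kinds into a shared representation.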

Index terms

Imitation Learning, Motion and Path Planning, Autonomous Vehicle Navigation