ICRA 2026

Accelerated Multi-Modal Motion Planning Using Context-Conditioned Diffusion Models

Edward Sandra, Lander Vanroye, Dries Dirckx, Ruben Cartuyvels, Jan Swevers, Wilm Decré

AI summary

CAMPD enables rapid, generalizable, multi-modal motion planning for robots in unseen environments without retraining or camera dependencies.

Motion Planning, Diffusion Models, Robot Navigation, Generalization, Real-Time Planning, Context-Aware AI

Problem

Classical motion planners struggle with scalability in high-dimensional, complex environments, while existing learning-based diffusion models either lack generalization to unseen environments or rely on specific sensors like cameras.

Approach

CAMPD uses a classifier-free diffusion model conditioned on sensor-agnostic contextual parameters via a U-Net with an attention mechanism, enabling planning-as-inference.
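
As an illustration of the classifier-free idea mentioned above, the following is a minimal sketch of the standard guidance rule, in which a context-conditioned noise prediction and an unconditional one are blended at each denoising step. The function name, the `guidance_scale` parameter, and the plain-list representation are assumptions made for this example; the summary does not state how CAMPD weights the two predictions.

```python
def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """Blend conditional and unconditional noise predictions.

    Standard classifier-free guidance rule (hypothetical helper, not CAMPD's code):
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
    guidance_scale = 0 ignores the context, 1 uses the conditional
    prediction as-is, and values above 1 extrapolate away from the
    unconditional prediction.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Toy predictions for a 3-dimensional noise vector (made-up numbers).
eps_c = [1.0, 2.0, -1.0]
eps_u = [0.0, 1.0, 1.0]
guided = classifier_free_guidance(eps_c, eps_u, guidance_scale=2.0)
```

In a full sampler, both predictions would come from the same network, evaluated once with the context and once with the context dropped.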

Key results

Significantly improved generalization to unseen environments
Real-time trajectory generation (~0.066 s per batch)
Higher success and feasibility rates than classical and learning-based baselines
Supports arbitrary numbers of contextual elements like obstacles

Why it matters

Enables real-time, adaptive robot motion planning in dynamic, cluttered settings, making diffusion-based planning practical for real-world robotic deployment.

Abstract

Classical methods in robot motion planning, such as sampling-based and optimization-based methods, often struggle with scalability towards higher-dimensional state spaces and complex environments. Diffusion models, known for their capability to learn complex, high-dimensional and multi-modal data distributions, provide a promising alternative when applied to motion planning problems and have already shown interesting results. However, most of the current approaches train their model for a single environment, limiting their generalization to environments not seen during training. The techniques that do train a model for multiple environments rely on a specific camera to provide the model with the necessary environmental information and therefore always require that sensor. To effectively adapt to diverse scenarios without the need for retraining, this research proposes Context-Aware Motion Planning Diffusion (CAMPD). CAMPD leverages a classifier-free denoising probabilistic diffusion model, conditioned on sensor-agnostic contextual information. An attention mechanism, integrated in the well-known U-Net architecture, conditions the model on an arbitrary number of contextual parameters. CAMPD is evaluated on a 7-DoF robot manipulator and benchmarked against state-of-the-art approaches on real-world tasks, showing its ability to generalize to unseen environments and generate high-quality, multi-modal trajectories, at a fraction of the time required by existing methods.
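
To make concrete why an attention mechanism can handle an arbitrary number of contextual parameters, here is a minimal single-head cross-attention sketch in plain Python. It is an illustration under stated assumptions, not the CAMPD architecture: the names `softmax` and `cross_attend` are hypothetical, the context tokens act as both keys and values, and the learned projection matrices a real model would use are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, context_tokens):
    """Attend from one trajectory feature to a variable-length context.

    query: list of d floats (e.g. a feature of the trajectory being denoised).
    context_tokens: non-empty list of d-float lists, one per contextual
        element (e.g. an encoded obstacle). Tokens serve as keys and values.
    Returns a d-float list: the attention-weighted average of the tokens.
    """
    if not context_tokens:
        raise ValueError("need at least one context token")
    d = len(query)
    # Scaled dot-product score between the query and every context token.
    scores = [
        sum(q * k for q, k in zip(query, token)) / math.sqrt(d)
        for token in context_tokens
    ]
    weights = softmax(scores)
    # The output size depends only on d, not on the number of tokens.
    return [
        sum(w * token[i] for w, token in zip(weights, context_tokens))
        for i in range(d)
    ]


query = [1.0, 0.0]
one_obstacle = [[0.5, 0.5]]
three_obstacles = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
out_one = cross_attend(query, one_obstacle)
out_three = cross_attend(query, three_obstacles)
```

Because the softmax normalizes over however many tokens are supplied, the same function accepts one obstacle or many and always returns a vector of the same size, which is the property that lets a fixed network consume a varying set of contextual elements.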

Index terms

Imitation Learning; Task and Motion Planning