ICRA 2026

EP-Diffuser: An Efficient Diffusion Model for Traffic Scene Generation and Prediction Via Polynomial Representations

Yue Yao, Mohamed-Khalil Bouzidi, Daniel Goehring, Joerg Reichardt

AI summary

EP-Diffuser leverages polynomial representations in a diffusion model to efficiently generate diverse, plausible traffic scenes, outperforming larger state-of-the-art models in plausibility and out-of-distribution generalization.

Traffic scene prediction, Diffusion models, Polynomial representations, Autonomous driving, Out-of-distribution generalization, Generative modeling

Problem

Most current traffic prediction models focus on forecasting the single most likely future and are assessed with regression-based metrics, which overlooks the multi-modal distribution of plausible alternatives that safe autonomous driving depends on. They also struggle to generalize to unseen environments.

Approach

The authors introduce EP-Diffuser, a parameter-efficient diffusion model that conditions on road layout and agent history, using Bernstein polynomials to represent both map geometry and trajectories for joint scene generation.
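To make the polynomial representation concrete, the following is a minimal sketch of how a 2D trajectory can be described by a handful of Bernstein control points and then sampled at any time resolution. It is an illustration only, not the authors' implementation: the function names, the degree-3 choice, and the control point values are all hypothetical.

```python
from math import comb


def bernstein_basis(n, i, t):
    """Value of the i-th Bernstein basis polynomial of degree n at t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t) ** (n - i)


def eval_curve(control_points, t):
    """Evaluate a 2D Bernstein-basis curve at normalized time t in [0, 1]."""
    n = len(control_points) - 1
    x = sum(bernstein_basis(n, i, t) * px for i, (px, _) in enumerate(control_points))
    y = sum(bernstein_basis(n, i, t) * py for i, (_, py) in enumerate(control_points))
    return (x, y)


def sample_trajectory(control_points, num_steps):
    """Sample the curve at num_steps evenly spaced times, endpoints included."""
    return [eval_curve(control_points, k / (num_steps - 1)) for k in range(num_steps)]


# Hypothetical degree-3 control points (in metres) for a gentle left turn.
# The degree and the values are assumptions made for this example.
control_points = [(0.0, 0.0), (10.0, 0.0), (20.0, 2.0), (30.0, 8.0)]

# 60 sampled positions are fully determined by just 4 control points.
trajectory = sample_trajectory(control_points, num_steps=60)
```

In this sketch the curve starts at the first control point and ends at the last one, and the basis functions sum to one at every `t`, so each sampled position is a weighted average of the control points. A model that operates on control points rather than on every timestep works with far fewer values per agent or lane segment, which is one plausible reading of how such a representation supports a compact model.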

Key results

Achieves comparable accuracy to state-of-the-art models with significantly fewer parameters
Generates highly plausible and diverse traffic scene continuations
Demonstrates superior out-of-distribution generalization on the Waymo Open dataset
Reveals a significant disconnect between traditional regression metrics and actual scene plausibility

Why it matters

It enables safer autonomous vehicle planning by efficiently modeling realistic traffic uncertainty and generalizing robustly to unseen driving scenarios.

Abstract

As the prediction horizon increases, predicting the future evolution of traffic scenes becomes increasingly difficult due to the multi-modal nature of agent motion. Most state-of-the-art (SotA) prediction models primarily focus on forecasting the most likely future. However, for the safe operation of autonomous vehicles, it is equally important to cover the distribution for plausible motion alternatives. To address this, we introduce EP-Diffuser, a novel parameter-efficient diffusion-based generative model designed to capture the distribution of possible traffic scene evolutions. Conditioned on road layout and agent history, our model acts as a predictor and generates diverse, plausible scene continuations. We benchmark EP-Diffuser against two SotA models in terms of plausibility, diversity, and accuracy of predictions on the Argoverse 2 dataset. Despite its significantly smaller model size, our approach achieves both highly plausible and diverse traffic scene predictions with comparable accuracy. We further evaluate model generalization in an out-of-distribution (OoD) test setting using Waymo Open dataset and show superior robustness of our approach.

Index terms

Autonomous Agents, Deep Learning Methods, Performance Evaluation and Benchmarking