EP-Diffuser: An Efficient Diffusion Model for Traffic Scene Generation and Prediction Via Polynomial Representations
Yue Yao, Mohamed-Khalil Bouzidi, Daniel Goehring, Joerg Reichardt
AI summary
Problem
Current traffic prediction models rely on regression-based metrics that only capture the most likely future, ignoring the multi-modal distribution of plausible alternatives essential for safe autonomous driving, while struggling to generalize to unseen environments.
Approach
The authors introduce EP-Diffuser, a parameter-efficient diffusion model that conditions on road layout and agent history, using Bernstein polynomials to represent both map geometry and trajectories for joint scene generation.
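To make the polynomial representation concrete, the following is a minimal sketch of how a sampled 2D trajectory could be compressed into a handful of Bernstein control points and reconstructed from them. The polynomial degree, the normalization of time to [0, 1], the least-squares fit, and all function names are assumptions chosen for illustration; they are not taken from the paper and do not represent the authors' implementation.

```python
import numpy as np
from math import comb


def bernstein_basis(degree, t):
    """Return B with B[i, k] = C(n, k) * t_i^k * (1 - t_i)^(n - k), shape (len(t), n + 1)."""
    t = np.asarray(t, dtype=float)[:, None]
    k = np.arange(degree + 1)
    binom = np.array([comb(degree, int(i)) for i in k], dtype=float)
    return binom * t ** k * (1.0 - t) ** (degree - k)


def fit_control_points(points, degree):
    """Least-squares fit of Bernstein control points to uniformly sampled points."""
    points = np.asarray(points, dtype=float)
    t = np.linspace(0.0, 1.0, len(points))  # assumed: time normalized to [0, 1]
    basis = bernstein_basis(degree, t)
    ctrl, *_ = np.linalg.lstsq(basis, points, rcond=None)
    return ctrl


def evaluate(ctrl, t):
    """Evaluate the polynomial curve defined by control points at parameters t."""
    return bernstein_basis(len(ctrl) - 1, t) @ ctrl


# Hypothetical trajectory: 30 samples of a gentle curve (x linear, y quadratic in time).
t_samples = np.linspace(0.0, 1.0, 30)
traj = np.stack([10.0 * t_samples, 2.0 * t_samples ** 2], axis=1)

ctrl = fit_control_points(traj, degree=5)   # 6 control points replace 30 samples
recon = evaluate(ctrl, t_samples)
max_error = np.max(np.abs(recon - traj))
```

In this sketch the 30 sampled positions are summarized by six 2D control points, which illustrates why a polynomial representation can reduce the size of the inputs and outputs a model has to handle. The same fitting routine could in principle be applied to lane centerlines or boundaries, again as an assumption rather than a description of the paper's pipeline.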
Key results
- Achieves comparable accuracy to state-of-the-art models with significantly fewer parameters
- Generates highly plausible and diverse traffic scene continuations
- Demonstrates superior out-of-distribution generalization on the Waymo Open dataset
- Reveals a significant disconnect between traditional regression metrics and actual scene plausibility
Why it matters
It enables safer autonomous vehicle planning by efficiently modeling realistic traffic uncertainty and generalizing robustly to unseen driving scenarios.
Abstract
As the prediction horizon increases, predicting the future evolution of traffic scenes becomes increasingly difficult due to the multi-modal nature of agent motion. Most state-of-the-art (SotA) prediction models primarily focus on forecasting the most likely future. However, for the safe operation of autonomous vehicles, it is equally important to cover the distribution of plausible motion alternatives. To address this, we introduce EP-Diffuser, a novel parameter-efficient diffusion-based generative model designed to capture the distribution of possible traffic scene evolutions. Conditioned on road layout and agent history, our model acts as a predictor and generates diverse, plausible scene continuations. We benchmark EP-Diffuser against two SotA models in terms of plausibility, diversity, and accuracy of predictions on the Argoverse 2 dataset. Despite its significantly smaller model size, our approach achieves both highly plausible and diverse traffic scene predictions with comparable accuracy. We further evaluate model generalization in an out-of-distribution (OoD) test setting using the Waymo Open dataset and show superior robustness of our approach.