ICRA 2026

ParkDiffusion++: Ego Intention Conditioned Joint Trajectory Prediction for Automated Parking Using Diffusion Models

Jiarong WEI, Anna Rehr, Christian Feist, Abhinav Valada

AI summary

ParkDiffusion++ jointly predicts ego intentions and multi-agent trajectories using diffusion models and counterfactual distillation, enabling safe what-if decision-making for automated parking.

Automated parking, trajectory prediction, diffusion models, ego intention, counterfactual reasoning, multi-agent interaction

Problem

Automated parking requires predicting multiple plausible ego intentions and the corresponding joint responses of surrounding agents, but existing methods treat these interdependent problems in isolation and lack supervision for counterfactual scenarios.

Approach

The method uses a two-stage framework. First, an ego intention tokenizer predicts a small set of discrete endpoint intentions from agent histories and vectorized map polylines. Second, an ego-conditioned joint predictor generates socially consistent multi-agent trajectories for each intention. A lightweight safety-guided denoiser refines the joint scenes during training, and counterfactual knowledge distillation supplies pseudo-targets for unobserved what-if scenarios.
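The two-stage control flow can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`softmax`, `top_k_intentions`, `what_if_rollout`), the score-based ranking of candidate endpoints, and the `joint_predictor` callable are all assumptions made for illustration. The sketch only shows the idea of selecting a small set of discrete endpoint intentions and then querying a joint predictor once per intention.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_intentions(endpoints, scores, k):
    """Return the k most probable candidate endpoints as (endpoint, prob) pairs.

    `endpoints` is a list of hashable (x, y) tuples and `scores` holds one
    raw score per endpoint. Both are hypothetical stand-ins for the output
    of an intention tokenizer.
    """
    if len(endpoints) != len(scores):
        raise ValueError("endpoints and scores must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    probs = softmax(scores)
    ranked = sorted(zip(endpoints, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def what_if_rollout(intentions, joint_predictor, scene):
    """Query the joint predictor once per ego intention.

    `joint_predictor(scene, endpoint)` is an assumed callable that returns
    the predicted joint scene for the given ego endpoint.
    """
    return {endpoint: joint_predictor(scene, endpoint) for endpoint, _ in intentions}


if __name__ == "__main__":
    # Toy stand-in predictor: echoes its inputs instead of running a model.
    def toy_predictor(scene, endpoint):
        return {"scene": scene, "ego_endpoint": endpoint}

    candidates = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    intentions = top_k_intentions(candidates, [0.0, 2.0, 1.0], k=2)
    for endpoint, joint_scene in what_if_rollout(intentions, toy_predictor, "lot_A").items():
        print(endpoint, joint_scene)
```

A downstream planner could compare the returned joint scenes to decide which ego intention to pursue; how that comparison is made is outside the scope of this sketch.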

Key results

State-of-the-art performance on the Dragon Lake Parking (DLP) and Intersections Drone (inD) datasets
Socially consistent joint trajectories conditioned on alternative ego intentions
Novel counterfactual knowledge distillation module for unobserved scenarios
Accurate what-if predictions showing appropriate reactive behaviors from surrounding agents

Why it matters

Enables safer and more robust decision-making for automated parking systems by modeling complex multi-agent interactions and counterfactual scenarios.

Abstract

Automated parking is a challenging operational domain for advanced driver assistance systems, requiring robust scene understanding and interaction reasoning. The key challenge is twofold: (i) predict multiple plausible ego intentions according to context and (ii) for each intention, predict the joint responses of surrounding agents, enabling effective what-if decision-making. However, existing methods often fall short, typically treating these interdependent problems in isolation. We propose ParkDiffusion++, which jointly learns a multi-modal ego intention predictor and an ego-conditioned multi-agent joint trajectory predictor for automated parking. Our approach makes several key contributions. First, we introduce an ego intention tokenizer that predicts a small set of discrete endpoint intentions from agent histories and vectorized map polylines. Second, we perform ego-intention-conditioned joint prediction, yielding socially consistent predictions of the surrounding agents for each possible ego intention. Third, we employ a lightweight safety-guided denoiser with different constraints to refine joint scenes during training, thus improving accuracy and safety. Fourth, we propose counterfactual knowledge distillation, where an EMA teacher refined by a frozen safety-guided denoiser provides pseudo-targets that capture how agents react to alternative ego intentions. Extensive evaluations demonstrate that ParkDiffusion++ achieves state-of-the-art performance on the Dragon Lake Parking (DLP) dataset and the Intersections Drone (inD) dataset. Importantly, qualitative what-if visualizations show that other agents react appropriately to different ego intentions.
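The EMA teacher mentioned in the fourth contribution can be sketched as follows. This assumes the standard exponential-moving-average update, in which each teacher parameter becomes a decayed blend of its previous value and the current student parameter. The abstract does not state the decay value or the parameter layout, so `decay=0.999`, the flat name-to-float dictionaries, and the function name `ema_update` are hypothetical.

```python
def ema_update(teacher, student, decay=0.999):
    """Return new teacher parameters after one EMA step.

    teacher, student: dicts mapping parameter names to floats (a simplified,
    assumed layout; real models hold tensors).
    decay: weight kept from the previous teacher value, in [0, 1].
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


if __name__ == "__main__":
    teacher = {"w": 1.0, "b": 0.0}
    student = {"w": 0.0, "b": 1.0}
    # With a high decay the teacher moves only slightly toward the student.
    print(ema_update(teacher, student, decay=0.999))
```

Because the teacher changes slowly, its outputs tend to be more stable than the student's, which is a common motivation for using an EMA teacher to produce pseudo-targets.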

Index terms

Intelligent Transportation Systems, AI-Based Methods, Behavior-Based Systems