Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation
Fabian Konstantinidis, Moritz Sackmann, Ulrich Franz Hofmann, Christoph Stiller
AI summary
Problem
Multi-agent driving simulation requires behavior models that are both computationally efficient and robust to diverse scenarios, but existing approaches struggle with scaling or pose-invariance. Additionally, training these models via reinforcement learning often fails to balance realism and robustness due to fixed reward scaling.
Approach
The authors encode each traffic participant and map element in its own local coordinate frame to enable shared, viewpoint-invariant feature extraction and token reuse. They train the behavior model using Adversarial Inverse Reinforcement Learning with a dynamic reward offset that automatically balances realism and robustness during training.
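To see why per-instance local frames give viewpoint invariance, consider the relative pose between two such frames. The following is a minimal sketch, not the authors' implementation: the function names `relative_pose` and `transform_global` and the `(x, y, heading)` tuple layout are assumptions made for illustration. It shows that the relative pose between an agent frame and a map-element frame is unchanged when the whole scene is rotated and shifted in the global frame.

```python
import math


def relative_pose(frame_a, frame_b):
    """Pose of frame_b expressed in frame_a; each frame is (x, y, heading).

    Hypothetical helper for illustration only.
    """
    ax, ay, ah = frame_a
    bx, by, bh = frame_b
    dx, dy = bx - ax, by - ay
    cos_a, sin_a = math.cos(ah), math.sin(ah)
    # Rotate the global offset into frame_a's axes.
    rel_x = cos_a * dx + sin_a * dy
    rel_y = -sin_a * dx + cos_a * dy
    # Wrap the heading difference to [-pi, pi].
    rel_h = math.atan2(math.sin(bh - ah), math.cos(bh - ah))
    return rel_x, rel_y, rel_h


def transform_global(frame, shift, rotation):
    """Rotate a frame about the global origin, then translate it."""
    x, y, h = frame
    c, s = math.cos(rotation), math.sin(rotation)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1], h + rotation)


agent = (10.0, 5.0, 0.3)   # assumed agent frame
lane = (12.0, 9.0, 1.1)    # assumed map-element frame

before = relative_pose(agent, lane)
after = relative_pose(
    transform_global(agent, (100.0, -50.0), 2.0),
    transform_global(lane, (100.0, -50.0), 2.0),
)
# The relative pose is the same regardless of the global viewpoint.
assert all(abs(p - q) < 1e-9 for p, q in zip(before, after))
```

Because only such relative quantities connect the local frames, features computed inside each frame do not depend on where the scene sits in global coordinates.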
Key results
- Instance-centric representation enables viewpoint-invariant encoding and static token reuse
- Query-centric symmetric context encoder models interactions through relative positional encodings between local frames
- Adaptive reward transformation dynamically balances realism and robustness
- Outperforms several agent-centric baselines in positional accuracy and robustness, while scaling efficiently with the number of tokens
Why it matters
Accelerates large-scale traffic simulation and improves behavior model generalization, benefiting automated vehicle development and scenario generation.
Abstract
Scalable multi-agent driving simulation requires behavior models that are both realistic and computationally efficient. We address this by optimizing the behavior model that controls individual traffic participants. To improve efficiency, we adopt an instance-centric scene representation, where each traffic participant and map element is modeled in its own local coordinate frame. This design enables efficient, viewpoint-invariant scene encoding and allows static map tokens to be reused across simulation steps. To model interactions, we employ a query-centric symmetric context encoder with relative positional encodings between local frames. We use Adversarial Inverse Reinforcement Learning to learn the behavior model and propose an adaptive reward transformation that automatically balances robustness and realism during training. Experiments demonstrate that our approach scales efficiently with the number of tokens, significantly reducing training and inference times, while outperforming several agent-centric baselines in terms of positional accuracy and robustness.
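The reuse of static map tokens across simulation steps can be illustrated with a small caching sketch. Everything here is hypothetical: `encode_local` is a placeholder that merely expresses points relative to an element's first point, standing in for a learned encoder, and `TokenCache` is an illustrative name, not a component described in the paper. The point is only that tokens computed in an element's own local frame do not depend on any agent's current pose, so they need to be computed once.

```python
def encode_local(points):
    """Placeholder 'encoder' (assumption): coordinates relative to the
    element's first point, so the result ignores global position."""
    x0, y0 = points[0]
    return tuple((x - x0, y - y0) for x, y in points)


class TokenCache:
    """Hypothetical cache that encodes static map elements only once."""

    def __init__(self, map_elements):
        self.map_elements = map_elements
        self._map_tokens = None
        self.map_encodings = 0  # counts how often the map was encoded

    def map_tokens(self):
        if self._map_tokens is None:
            self._map_tokens = [encode_local(e) for e in self.map_elements]
            self.map_encodings += 1
        return self._map_tokens

    def step(self, agent_histories):
        # Agents move, so their tokens are recomputed at every step.
        agent_tokens = [encode_local(h) for h in agent_histories]
        return self.map_tokens() + agent_tokens


cache = TokenCache([[(0, 0), (1, 0)], [(5, 5), (5, 6)]])
tokens_t0 = cache.step([[(2, 2), (3, 2)]])
tokens_t1 = cache.step([[(3, 2), (4, 2)]])
assert cache.map_encodings == 1        # map encoded once, reused afterwards
assert tokens_t0[:2] == tokens_t1[:2]  # identical static map tokens
```

In an agent-centric representation the map would instead have to be re-expressed in each agent's frame at every step, which is the cost this kind of reuse avoids. The relative poses between local frames still change as agents move and would be recomputed per step.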