ICRA 2026

Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation

Fabian Konstantinidis, Moritz Sackmann, Ulrich Franz Hofmann, Christoph Stiller

AI summary

An instance-centric scene representation combined with adaptive reward transformation enables scalable, robust, and realistic multi-agent driving simulation with significantly reduced computational costs.

multi-agent simulation, behavior modeling, instance-centric representation, adversarial inverse reinforcement learning, automated driving, computational efficiency

Problem

Multi-agent driving simulation requires behavior models that are both computationally efficient and robust across diverse scenarios, but existing approaches either scale poorly or lack pose invariance. In addition, training these models with reinforcement learning often fails to balance realism and robustness because the reward scaling is fixed.

Approach

The authors encode each traffic participant and map element in its own local coordinate frame to enable shared, viewpoint-invariant feature extraction and token reuse. They train the behavior model using Adversarial Inverse Reinforcement Learning with a dynamic reward offset that automatically balances realism and robustness during training.
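The idea of relating elements through their local frames can be illustrated with a small sketch. It assumes 2D poses given as (x, y, heading) in a shared global frame; the function names `relative_pose` and `transform` are hypothetical and not taken from the paper. The relative pose between two local frames does not change when the whole scene is shifted or rotated, which is the property that makes a viewpoint-invariant encoding possible.

```python
import math


def relative_pose(pose_i, pose_j):
    """Return the pose of element j expressed in the local frame of element i.

    Poses are (x, y, heading) tuples in a shared global frame (an assumption
    made for this sketch).
    """
    xi, yi, hi = pose_i
    xj, yj, hj = pose_j
    dx, dy = xj - xi, yj - yi
    cos_h, sin_h = math.cos(hi), math.sin(hi)
    # Rotate the global offset by -hi to express it in frame i.
    rel_x = cos_h * dx + sin_h * dy
    rel_y = -sin_h * dx + cos_h * dy
    # Wrap the heading difference to (-pi, pi].
    rel_h = math.atan2(math.sin(hj - hi), math.cos(hj - hi))
    return rel_x, rel_y, rel_h


def transform(pose, tx, ty, rot):
    """Apply a global rigid transform (rotate by rot, then translate)."""
    x, y, h = pose
    c, s = math.cos(rot), math.sin(rot)
    return (c * x - s * y + tx, s * x + c * y + ty, h + rot)


agent = (1.0, 1.0, math.pi / 2)
lane_segment = (1.0, 3.0, math.pi / 2)

before = relative_pose(agent, lane_segment)
# Move the entire scene; the relative pose should stay the same.
after = relative_pose(transform(agent, 5.0, -2.0, 0.7),
                      transform(lane_segment, 5.0, -2.0, 0.7))
```

Because `before` and `after` agree up to floating-point error, features computed from relative poses do not depend on where the global origin is placed, and a static map element's local-frame features stay valid from one simulation step to the next.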

Key results

Instance-centric representation enables viewpoint-invariant encoding and static token reuse
Query-centric symmetric context encoder improves interaction modeling
Adaptive reward transformation dynamically balances realism and robustness
Outperforms baselines in positional accuracy, robustness, and simulation scalability

Why it matters

Accelerates large-scale traffic simulation and improves behavior model generalization, benefiting automated vehicle development and scenario generation.

Abstract

Scalable multi-agent driving simulation requires behavior models that are both realistic and computationally efficient. We address this by optimizing the behavior model that controls individual traffic participants. To improve efficiency, we adopt an instance-centric scene representation, where each traffic participant and map element is modeled in its own local coordinate frame. This design enables efficient, viewpoint-invariant scene encoding and allows static map tokens to be reused across simulation steps. To model interactions, we employ a query-centric symmetric context encoder with relative positional encodings between local frames. We use Adversarial Inverse Reinforcement Learning to learn the behavior model and propose an adaptive reward transformation that automatically balances robustness and realism during training. Experiments demonstrate that our approach scales efficiently with the number of tokens, significantly reducing training and inference times, while outperforming several agent-centric baselines in terms of positional accuracy and robustness.
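To make the role of a reward offset concrete, the sketch below uses the common AIRL-style reward log D - log(1 - D) computed from a discriminator output D, and adds a constant offset to it. This is only an illustration under stated assumptions: the abstract does not specify the form of the adaptive reward transformation, so the offset is a plain parameter here, and the names `airl_reward` and `episode_return` are hypothetical.

```python
import math


def airl_reward(d):
    """AIRL-style reward log D - log(1 - D) for a discriminator output d in (0, 1)."""
    return math.log(d) - math.log(1.0 - d)


def episode_return(d_values, offset=0.0):
    """Undiscounted return of an episode with a constant offset added per step.

    The offset is an illustrative parameter, not the authors' adaptive scheme.
    """
    return sum(airl_reward(d) + offset for d in d_values)


# Two episodes that the discriminator cannot tell apart from expert data
# (d = 0.5 at every step): one runs to the end, one terminates early,
# for example after a collision.
long_episode = [0.5] * 10
short_episode = [0.5] * 3

positive = (episode_return(long_episode, 1.0), episode_return(short_episode, 1.0))
negative = (episode_return(long_episode, -1.0), episode_return(short_episode, -1.0))
```

With a positive offset the longer episode earns the higher return, which rewards avoiding early termination; with a negative offset the shorter episode earns the higher return. A fixed offset therefore builds in one bias or the other, which suggests why adjusting it during training could help balance robustness against realism.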

Index terms

Learning from Demonstration, Intelligent Transportation Systems, Multi-Robot Systems