ForSim: Stepwise Forward Simulation for Traffic Policy Fine-Tuning
Keyu Chen, Wenchao Sun, Hao Cheng, Zheng Fu, Sifa Zheng
AI summary
Problem
Traffic simulation struggles with covariate shift from open-loop imitation learning and fails to capture multi-modal behaviors and realistic inter-agent interactions during forward rollout, limiting policy fine-tuning fidelity.
Approach
ForSim unrolls candidate trajectories stepwise in a virtual domain. At each timestep it selects the candidate that best matches the reference trajectory and propagates it with a PID controller and a kinematic bicycle model, while updating other agents' predictions at every step to enable closed-loop, interaction-aware evolution.
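The PID controller and kinematic bicycle model named in the approach can be illustrated with a minimal sketch. This is not the authors' implementation: the gains, the 2.8 m wheelbase, the forward-Euler integration, the rear-axle reference point, and the names `PID` and `bicycle_step` are all assumptions chosen for illustration.

```python
import math


class PID:
    """Textbook PID controller (gains are illustrative assumptions)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        # Accumulate the integral term and estimate the derivative by finite difference.
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def bicycle_step(state, accel, steer, dt, wheelbase=2.8):
    """Advance (x, y, yaw, v) by one forward-Euler step of the kinematic bicycle model.

    The wheelbase value and the rear-axle reference point are assumptions.
    """
    x, y, yaw, v = state
    x_new = x + v * math.cos(yaw) * dt
    y_new = y + v * math.sin(yaw) * dt
    yaw_new = yaw + v / wheelbase * math.tan(steer) * dt
    v_new = max(0.0, v + accel * dt)  # no reversing in this sketch
    return (x_new, y_new, yaw_new, v_new)


# Usage: track a target speed of 8 m/s on a straight road.
pid = PID(kp=1.0, ki=0.1, kd=0.0)
state = (0.0, 0.0, 0.0, 5.0)
for _ in range(10):
    accel = pid.step(8.0 - state[3], dt=0.1)
    state = bicycle_step(state, accel, steer=0.0, dt=0.1)
```

Propagating the selected trajectory through a motion model of this kind, rather than copying waypoints directly, keeps every simulated step kinematically feasible.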
Key results
- Proposes ForSim, a stepwise closed-loop forward simulation paradigm
- Introduces Trajectory-Aligned Rollout to preserve multimodal fidelity
- Implements stepwise prediction for other agents to capture dynamic interactions
- Consistently improves safety while maintaining efficiency, realism, and comfort when integrated with RIFT
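Stepwise prediction for other agents can be pictured as a receding-horizon loop: every agent is re-predicted at every step, and only the first predicted waypoint is executed before the next prediction. The sketch below is a toy under stated assumptions. The constant-velocity predictor is a hypothetical stand-in for a learned prediction model, and `constant_velocity_predict` and `rollout` are illustrative names, not part of ForSim or RIFT.

```python
def constant_velocity_predict(pos, vel, horizon, dt):
    """Hypothetical stand-in predictor: straight-line extrapolation over the horizon."""
    return [(pos[0] + vel[0] * dt * k, pos[1] + vel[1] * dt * k)
            for k in range(1, horizon + 1)]


def rollout(agents, steps, horizon, dt):
    """Closed-loop rollout over agents given as {name: (position, velocity)}.

    At each step every agent is re-predicted from the current scene and only
    the first predicted waypoint is executed.
    """
    history = []
    for _ in range(steps):
        # Re-predict all agents from the current state (the stepwise part).
        plans = {name: constant_velocity_predict(pos, vel, horizon, dt)
                 for name, (pos, vel) in agents.items()}
        # Execute only the first waypoint of each plan, then repeat.
        agents = {name: (plans[name][0], agents[name][1]) for name in agents}
        history.append({name: pos for name, (pos, _) in agents.items()})
    return history
```

With a predictor that conditions on the other agents' current states, the same loop lets each agent react to what the others actually did in the previous step, which a one-shot open-loop prediction cannot do.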
Why it matters
Enhances the fidelity and reliability of traffic simulation for autonomous driving by modeling closed-loop multimodal interactions, benefiting researchers and engineers developing safe, realistic driving policies.
Abstract
As the foundation of closed-loop training and evaluation in autonomous driving, traffic simulation still faces two fundamental challenges: covariate shift introduced by open-loop imitation learning and limited capacity to reflect the multimodal behaviors observed in real-world traffic. Although recent frameworks such as RIFT have partially addressed these issues through group-relative optimization, their forward simulation procedures remain largely non-reactive, leading to unrealistic agent interactions within the virtual domain and ultimately limiting simulation fidelity. To address these issues, we propose ForSim, a stepwise closed-loop forward simulation paradigm. At each virtual timestep, the traffic agent propagates the virtual candidate trajectory that best spatiotemporally matches the reference trajectory through physically grounded motion dynamics, thereby preserving multimodal behavioral diversity while ensuring intra-modality consistency. Other agents are updated with stepwise predictions, yielding coherent and interaction-aware evolution. When incorporated into the RIFT traffic simulation framework, ForSim operates in conjunction with group-relative optimization to fine-tune traffic policy. Extensive experiments confirm that this integration consistently improves safety while maintaining efficiency, realism, and comfort. These results underscore the importance of modeling closed-loop multimodal interactions within forward simulation and enhance the fidelity and reliability of traffic simulation for autonomous driving. Project Page: https://currychen77.github.io/ForSim/
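Selecting the candidate that best spatiotemporally matches the reference trajectory can be sketched as a nearest-trajectory lookup. The abstract does not state the matching metric, so the average pointwise displacement used here is an assumption, as are the function names and the representation of trajectories as time-aligned lists of (x, y) waypoints.

```python
import math


def average_displacement(candidate, reference):
    """Mean Euclidean distance between time-aligned waypoints (assumed metric)."""
    n = min(len(candidate), len(reference))
    if n == 0:
        raise ValueError("trajectories must contain at least one waypoint")
    return sum(math.dist(candidate[i], reference[i]) for i in range(n)) / n


def select_candidate(candidates, reference):
    """Return the index of the candidate closest to the reference trajectory."""
    return min(range(len(candidates)),
               key=lambda i: average_displacement(candidates[i], reference))


# Usage: two candidate modes, the second stays closer to the reference.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],
    [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)],
]
best = select_candidate(candidates, reference)
```

Repeating this selection at every virtual timestep keeps the rollout within a single behavioral mode, which is one way to read the abstract's "intra-modality consistency", while the set of candidates still covers multiple modes.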