ICRA 2026

ForSim: Stepwise Forward Simulation for Traffic Policy Fine-Tuning

Keyu Chen, Wenchao Sun, Hao Cheng, Zheng Fu, Sifa Zheng

AI summary

Key figure (auto-extracted from paper)

ForSim's stepwise closed-loop simulation preserves multimodal driving diversity and physical plausibility, consistently improving safety while maintaining realism in traffic policy fine-tuning.

Traffic simulation, Forward simulation, Closed-loop learning, Multimodal driving, Policy fine-tuning, Autonomous driving

Problem

Traffic simulation suffers from covariate shift introduced by open-loop imitation learning and has limited capacity to capture multimodal behaviors. Forward rollouts in existing frameworks are also largely non-reactive, which produces unrealistic inter-agent interactions and limits policy fine-tuning fidelity.

Approach

ForSim unrolls candidate trajectories stepwise in a virtual domain. At each virtual timestep, it selects the candidate trajectory that best spatiotemporally matches the reference trajectory and propagates it through a PID controller and a kinematic bicycle model. Other agents' predictions are updated at every step, which enables closed-loop, interaction-aware evolution.
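The stepwise loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `State`, `PID`, `bicycle_step`, `closest_candidate`, and `rollout` are hypothetical, as are the controller gains, the wheelbase, the actuator limits, and the mean-distance matching cost. The `predict` callable stands in for the traffic policy and is assumed to return candidate trajectories as lists of `(x, y)` waypoints starting one step ahead of the current state. Re-prediction for other agents is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    x: float
    y: float
    yaw: float  # heading in radians
    v: float    # longitudinal speed in m/s


class PID:
    """Textbook PID controller (gains are illustrative assumptions)."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv


def bicycle_step(s, accel, steer, dt, wheelbase=2.7):
    """One Euler step of the rear-axle kinematic bicycle model."""
    return State(
        x=s.x + s.v * math.cos(s.yaw) * dt,
        y=s.y + s.v * math.sin(s.yaw) * dt,
        yaw=s.yaw + s.v / wheelbase * math.tan(steer) * dt,
        v=max(0.0, s.v + accel * dt),
    )


def closest_candidate(candidates, ref_tail):
    """Index of the candidate with the smallest mean waypoint distance
    to the remaining reference (an assumed matching cost)."""
    def cost(traj):
        pairs = list(zip(traj, ref_tail))
        return sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    return min(range(len(candidates)), key=lambda i: cost(candidates[i]))


def clamp(value, lo, hi):
    return max(lo, min(hi, value))


def wrap(angle):
    """Wrap an angle to [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def rollout(state, reference, predict, dt=0.1):
    """Advance one agent step by step, re-selecting the candidate each step."""
    speed_pid, steer_pid = PID(1.0), PID(1.5)
    visited = [state]
    for t in range(len(reference)):
        candidates = predict(state)               # re-predict from the current state
        best = closest_candidate(candidates, reference[t:])
        tx, ty = candidates[best][0]              # next waypoint of the matched candidate
        dx, dy = tx - state.x, ty - state.y
        speed_err = math.hypot(dx, dy) / dt - state.v
        heading_err = wrap(math.atan2(dy, dx) - state.yaw)
        accel = clamp(speed_pid.step(speed_err, dt), -4.0, 2.0)
        steer = clamp(steer_pid.step(heading_err, dt), -0.5, 0.5)
        state = bicycle_step(state, accel, steer, dt)
        visited.append(state)
    return visited
```

Re-selecting the candidate at every step, rather than committing to one open-loop trajectory, is what keeps the rollout tied to the reference mode while the motion itself stays kinematically feasible.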

Key results

Proposes ForSim, a stepwise closed-loop forward simulation paradigm
Introduces Trajectory-Aligned Rollout to preserve multimodal fidelity
Implements stepwise prediction for other agents to capture dynamic interactions
Consistently improves safety while maintaining efficiency, realism, and comfort when integrated with RIFT

Why it matters

Enhances the fidelity and reliability of traffic simulation for autonomous driving by modeling closed-loop multimodal interactions, benefiting researchers and engineers developing safe, realistic driving policies.

Abstract

As the foundation of closed-loop training and evaluation in autonomous driving, traffic simulation still faces two fundamental challenges: covariate shift introduced by open-loop imitation learning and limited capacity to reflect the multimodal behaviors observed in real-world traffic. Although recent frameworks such as RIFT have partially addressed these issues through group-relative optimization, their forward simulation procedures remain largely non-reactive, leading to unrealistic agent interactions within the virtual domain and ultimately limiting simulation fidelity. To address these issues, we propose ForSim, a stepwise closed-loop forward simulation paradigm. At each virtual timestep, the traffic agent propagates the virtual candidate trajectory that best spatiotemporally matches the reference trajectory through physically grounded motion dynamics, thereby preserving multimodal behavioral diversity while ensuring intra-modality consistency. Other agents are updated with stepwise predictions, yielding coherent and interaction-aware evolution. When incorporated into the RIFT traffic simulation framework, ForSim operates in conjunction with group-relative optimization to fine-tune traffic policy. Extensive experiments confirm that this integration consistently improves safety while maintaining efficiency, realism, and comfort. These results underscore the importance of modeling closed-loop multimodal interactions within forward simulation and enhance the fidelity and reliability of traffic simulation for autonomous driving. Project Page: https://currychen77.github.io/ForSim/
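The abstract does not spell out what group-relative optimization computes. One common reading, assumed here and not confirmed by this page, is that several rollouts of the same scenario are scored and each reward is normalized against the mean and standard deviation of its group. The function name `group_relative_advantages` and the `eps` stabilizer are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own group.

    Assumed formulation: (r - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Under this reading, a rollout is rewarded only for being better than its siblings from the same scenario, which is why the quality of each forward-simulated rollout matters for fine-tuning.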

Index terms

Autonomous Agents, Motion and Path Planning, Reinforcement Learning