DIAL-GS: Dynamic Instance Aware Reconstruction for Label-Free Street Scenes with 4D Gaussian Splatting
Chenpeng Su, Wenhua Wu, Chensheng Peng, Tianchen Deng, Zhe Liu, Hesheng Wang
AI summary
Problem
Self-supervised street scene reconstruction suffers from dynamic-static confusion and lacks instance-level awareness, hindering fine-grained editing and scalable data synthesis without costly manual annotations.
Approach
The method identifies dynamic instances by measuring appearance and position inconsistencies between warped renderings and ground truth, then unifies static and dynamic elements using instance-aware 4D Gaussians with a reciprocal identity-dynamics training loop.
Key results
- Better image reconstruction and novel view synthesis than existing self-supervised baselines on Waymo and KITTI
- Accurate static-dynamic separation without manual annotations
- Instance-level scene editing capability for self-supervised reconstruction
- Reduced dynamic-static confusion via appearance-position inconsistency scoring
Why it matters
Provides a scalable, annotation-free solution for high-fidelity urban scene modeling, directly benefiting autonomous driving simulation and closed-loop testing.
Abstract
Urban scene reconstruction is critical for autonomous driving, enabling structured 3D representations for data synthesis and closed-loop testing. Supervised approaches rely on costly human annotations and lack scalability, while current self-supervised methods often confuse static and dynamic elements and fail to distinguish individual dynamic objects, limiting fine-grained editing. We propose DIAL-GS, a novel dynamic instance-aware reconstruction method for label-free street scenes with 4D Gaussian Splatting. We first accurately identify dynamic instances by exploiting appearance–position inconsistency between warped rendering and actual observations. Guided by instance-level dynamic perception, we employ instance-aware 4D Gaussians as the unified volumetric representation, realizing dynamic-adaptive and instance-aware reconstruction. Furthermore, we introduce a reciprocal mechanism through which identity and dynamics reinforce each other, enhancing both integrity and consistency. Experiments on urban driving scenarios show that DIAL-GS surpasses existing self-supervised baselines in reconstruction quality and instance-level editing, offering a concise yet powerful solution for urban scene modeling. Our code and models are available at: https://github.com/IRMVLab/DIAL-GS.
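The abstract does not spell out how the appearance–position inconsistency is computed, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a warped rendering and an observed image are available as float arrays in [0, 1], together with one boolean instance mask for each. The function names, the equal weights, and the threshold are all hypothetical choices made for this example.

```python
import numpy as np


def appearance_inconsistency(rendered, observed, mask):
    """Mean absolute color difference inside `mask` (hypothetical measure).

    rendered, observed: float arrays of shape (H, W, 3) in [0, 1].
    mask: boolean array of shape (H, W).
    """
    if not mask.any():
        return 0.0
    return float(np.abs(rendered[mask] - observed[mask]).mean())


def position_inconsistency(mask_rendered, mask_observed):
    """Centroid distance between two masks, normalized by the image diagonal."""
    if not mask_rendered.any() or not mask_observed.any():
        return 0.0
    centroid_r = np.argwhere(mask_rendered).mean(axis=0)
    centroid_o = np.argwhere(mask_observed).mean(axis=0)
    diagonal = np.hypot(*mask_observed.shape)
    return float(np.linalg.norm(centroid_r - centroid_o) / diagonal)


def dynamic_score(rendered, observed, mask_rendered, mask_observed,
                  w_app=0.5, w_pos=0.5):
    """Weighted sum of both cues; the weights are assumptions, not from the paper."""
    app = appearance_inconsistency(rendered, observed, mask_observed)
    pos = position_inconsistency(mask_rendered, mask_observed)
    return w_app * app + w_pos * pos


def is_dynamic(score, threshold=0.05):
    """Flag an instance as dynamic when its score exceeds an assumed threshold."""
    return score > threshold
```

Under these assumptions, an instance whose warped rendering matches the observation in both color and location scores zero, while an instance that has moved between frames scores high on both cues and is flagged as dynamic.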