ICRA 2026

ArmGS: Composite Gaussian Appearance Refinement for Modeling Dynamic Urban Environments

Guile Wu, Dongfeng Bai, Bingbing Liu

AI summary

ArmGS achieves state-of-the-art photorealistic dynamic urban scene modeling and real-time rendering by refining composite 3D Gaussians across local, global, and actor granularities.

3D Gaussian Splatting, Autonomous Driving Simulation, Dynamic Scene Modeling, Novel View Synthesis, Appearance Refinement, Real-time Rendering

Problem

Existing 3D Gaussian splatting methods for autonomous driving neglect fine-grained appearance variations across frames and camera viewpoints, resulting in lost details and suboptimal simulation quality.

Approach

The authors introduce ArmGS, which applies a multi-level appearance refinement scheme to optimize transformation parameters for composite 3D Gaussians at local, global, and dynamic actor levels.
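One way to picture the three levels is as learnable affine color transforms applied at different scopes. The sketch below is only an illustration under that assumption: the affine form, the function names, and the parameter layout are hypothetical and are not taken from the paper, which may parameterize its transformations differently.

```python
import numpy as np

def affine_color(colors, scale, shift):
    """Apply a per-channel affine transform: colors * scale + shift."""
    return colors * scale + shift

def refine_gaussian_colors(base_colors, actor_ids, local_params, actor_params, frame_idx):
    """Local and actor levels (hypothetical parameterization).

    base_colors:  (N, 3) RGB of all composite Gaussians
    actor_ids:    (N,) int, -1 for background, otherwise the actor index
    local_params: dict with 'scale' and 'shift', each of shape (N, 3)
    actor_params: dict with 'scale' and 'shift', each (num_actors, num_frames, 3)
    """
    # Local level: every Gaussian carries its own transform.
    colors = affine_color(base_colors, local_params["scale"], local_params["shift"])

    # Actor level: Gaussians of one actor share a per-frame transform.
    is_actor = actor_ids >= 0
    ids = actor_ids[is_actor]
    colors[is_actor] = affine_color(
        colors[is_actor],
        actor_params["scale"][ids, frame_idx],
        actor_params["shift"][ids, frame_idx],
    )
    return colors

def refine_image(rendered, global_params, frame_idx, cam_idx):
    """Global level: one transform per (frame, camera) image.

    rendered:      (H, W, 3) image produced by the splatting renderer
    global_params: dict with 'scale' and 'shift', each (num_frames, num_cams, 3)
    """
    scale = global_params["scale"][frame_idx, cam_idx]
    shift = global_params["shift"][frame_idx, cam_idx]
    return np.clip(rendered * scale + shift, 0.0, 1.0)
```

In an actual training loop these parameters would be optimized jointly with the Gaussians through the differentiable renderer; NumPy is used here only to keep the data flow easy to read.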

Key results

First method to explicitly embed multi-granularity appearances for dynamic urban scenes
Achieves state-of-the-art reconstruction metrics (PSNR 38.1, SSIM 0.957) on Waymo
Enables real-time rendering while preserving splatting differentiability
Demonstrates superior novel view synthesis across Waymo, KITTI, NOTR, and VKITTI2
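For readers unfamiliar with the PSNR figure quoted above, the metric is conventionally computed from the mean squared error between a rendered image and its ground truth. The following is a minimal sketch of that standard definition; the function name and the assumption that pixel values lie in [0, 1] are illustrative, and the paper's own evaluation code is not shown here.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Because the scale is logarithmic, each 10 dB gain corresponds to a tenfold reduction in mean squared error.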

Why it matters

It enables high-fidelity, real-time driving simulation critical for validating autonomous driving safety in challenging corner cases.

Abstract

This work focuses on modeling dynamic urban environments for autonomous driving simulation. Contemporary data-driven methods using neural radiance fields have achieved photorealistic driving scene modeling, but they suffer from low rendering efficacy. Recently, some approaches have explored 3D Gaussian splatting for modeling dynamic urban scenes, enabling high-fidelity reconstruction and real-time rendering. However, these approaches often neglect to model fine-grained variations between frames and camera viewpoints, leading to suboptimal results. In this work, we propose a new approach named ArmGS that exploits composite driving Gaussian splatting with multi-granularity appearance refinement for autonomous driving scene modeling. The core idea of our approach is devising a multi-level appearance modeling scheme to optimize a set of transformation parameters for composite Gaussian refinement from multiple granularities, ranging from local Gaussian level to global image level and dynamic actor level. This not only models global scene appearance variations between frames and camera viewpoints, but also models local fine-grained changes of background and objects. Extensive experiments on multiple challenging autonomous driving datasets, namely, Waymo, KITTI, NOTR and VKITTI2, demonstrate the superiority of our approach over the state-of-the-art methods.

Index terms

Simulation and Animation; Deep Learning for Visual Perception; Computer Vision for Transportation