ICRA 2026

GSRender: Deduplicated Occupancy Estimation Via Weakly Supervised 3D Gaussian Splatting

Qianpu Sun, Sifan Zhou, Shu Changyong, Sirui Han, Chun Yuan

AI summary

GSRender achieves state-of-the-art weakly-supervised 3D occupancy estimation by replacing NeRF sampling with 3D Gaussian Splatting and reducing duplicated predictions with a Ray Compensation module that draws on features from adjacent frames.

3D Gaussian Splatting, Weakly-Supervised Learning, Occupancy Estimation, Autonomous Driving, Ray Compensation, Scene Reconstruction

Problem

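Existing weakly-supervised occupancy methods rely on NeRF, which forces a trade-off in the number of samples per ray: too many hurts efficiency, too few hurts accuracy, and mIoU can vary by 5-10 points as a result. In addition, even with surround-view inputs, only a single image is rendered from each viewpoint at any given moment, which leads to duplicated predictions.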
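
The sampling trade-off can be illustrated with a minimal sketch of NeRF-style volume rendering along one ray. This is not taken from the paper or its code: the function `render_ray`, the `thin_wall` density, and the sample counts are all hypothetical, chosen only to show how a coarse sampler can step over a thin structure that a denser (and costlier) sampler finds.

```python
import math

def render_ray(density_fn, near, far, n_samples):
    """Uniform-sample quadrature of the volume rendering integral.

    Returns (expected_depth, accumulated_opacity) for a single ray.
    Cost grows linearly with n_samples.
    """
    delta = (far - near) / n_samples
    transmittance = 1.0
    depth = 0.0
    opacity = 0.0
    for i in range(n_samples):
        t = near + (i + 0.5) * delta              # midpoint of the i-th interval
        alpha = 1.0 - math.exp(-density_fn(t) * delta)
        weight = transmittance * alpha            # contribution of this sample
        depth += weight * t
        opacity += weight
        transmittance *= 1.0 - alpha              # light remaining after this sample
    return depth, opacity

def thin_wall(t):
    # Hypothetical scene: a 0.2 m thick dense wall starting 10 m along the ray.
    return 50.0 if 10.0 <= t <= 10.2 else 0.0

coarse = render_ray(thin_wall, 0.0, 40.0, 8)      # cheap, 5 m between samples
fine = render_ray(thin_wall, 0.0, 40.0, 800)      # 100x the cost, 5 cm between samples
```

With 8 samples no midpoint lands inside the wall, so the ray accumulates no opacity at all; with 800 samples the wall is hit and the expected depth falls inside it.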

Approach

The method models the scene as a collection of 3D Gaussians to simplify the sampling process, compensates features from adjacent video frames using a Ray Compensation module, and applies a redesigned dynamic loss to remove the influence of dynamic objects in those adjacent frames.
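
As a rough illustration of why a Gaussian representation sidesteps per-ray sample counts, the sketch below applies the standard front-to-back alpha compositing used in 3D Gaussian Splatting to the Gaussians covering one pixel. It is a simplification under stated assumptions, not the authors' implementation: the function `composite` and the tuple layout `(depth, alpha, semantics)` are hypothetical, and each `alpha` is assumed to be the Gaussian's opacity already evaluated at the pixel.

```python
def composite(gaussians):
    """Front-to-back alpha compositing of per-Gaussian semantic scores.

    gaussians: list of (depth, alpha, semantics) tuples, where semantics is a
    list of class scores. Returns (semantics, expected_depth, remaining_transmittance).
    """
    ordered = sorted(gaussians, key=lambda g: g[0])   # nearest Gaussian first
    n_classes = len(ordered[0][2])
    out = [0.0] * n_classes
    depth = 0.0
    transmittance = 1.0
    for d, alpha, sem in ordered:
        w = transmittance * alpha                     # blending weight of this Gaussian
        for k in range(n_classes):
            out[k] += w * sem[k]
        depth += w * d
        transmittance *= 1.0 - alpha
    return out, depth, transmittance

# Two Gaussians on one pixel, given out of depth order on purpose.
sem, depth, remaining = composite([
    (5.0, 0.5, [1.0, 0.0]),
    (2.0, 0.5, [0.0, 1.0]),
])
```

The loop runs once per Gaussian that overlaps the pixel, so there is no separate hyperparameter for the number of samples along the ray.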

Key results

Replaces NeRF sampling with 3D Gaussian Splatting for efficient occupancy rendering
Introduces a Ray Compensation module that reduces duplicated predictions by compensating with features from adjacent frames
Achieves state-of-the-art RayIoU (+6.0) among weakly-supervised methods
Significantly narrows the performance gap with fully 3D-supervised approaches

Why it matters

Provides a more accurate and practical 3D scene understanding pipeline for autonomous driving that reduces reliance on expensive 3D ground-truth labels.

Abstract

Weakly-supervised 3D occupancy perception is crucial for vision-based autonomous driving in outdoor environments. Previous methods based on NeRF often face a challenge in balancing the number of samples used. Too many samples can decrease efficiency, while too few can compromise accuracy, leading to variations in the mean Intersection over Union (mIoU) by 5-10 points. Furthermore, even with surrounding-view image inputs, only a single image is rendered from each viewpoint at any given moment. This limitation leads to duplicated predictions, which significantly impacts the practicality of the approach. However, this issue has largely been overlooked in existing research. To address this, we propose GSRender, which uses 3D Gaussian Splatting for weakly-supervised occupancy estimation, simplifying the sampling process. Additionally, we introduce the Ray Compensation module, which reduces duplicated predictions by compensating for features from adjacent frames. Finally, we redesign the dynamic loss to remove the influence of dynamic objects from adjacent frames. Extensive experiments show that our approach achieves SOTA results in RayIoU (+6.0), while also narrowing the gap with 3D-supervised methods. This work lays a solid foundation for weakly-supervised occupancy perception. The code is available at https://github.com/Jasper-sudo-Sun/GSRender.
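
For readers unfamiliar with the mIoU metric quoted above, a minimal sketch of the usual definition over flattened voxel labels follows. The function name `mean_iou` and the toy labels are illustrative assumptions; the paper's evaluation protocol (class list, visibility masks, and the separate RayIoU metric) is not reproduced here.

```python
def mean_iou(pred, target, n_classes):
    """Mean Intersection over Union across classes for flat label lists."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:                 # skip classes absent from both pred and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with four voxels and two classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. A change of 5-10 points refers to this value expressed as a percentage.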

Index terms

Data Sets for Robotic Vision, Recognition, Visual Learning