ICRA 2026

Memory-Efficient Voxelized Renderable Neural 3D Spatial Representation for Vision-Based Robotics

Howoong Jun, Seongbo Ha, Jaewon Lee, Hyeonwoo Yu, Songhwai Oh

AI summary

3DSR achieves over 90% of the best competing method's reconstruction quality while using only 54.54% of that method's memory, enabling high-fidelity neural rendering for memory-constrained robotics.

3D Gaussian splatting, voxelization, memory efficiency, vision-based robotics, neural rendering, visual localization

Problem

Existing neural 3D rendering methods are too memory-intensive for real-time robotics, while traditional voxel maps lack the detailed visual fidelity robots need for perception and navigation.

Approach

The method compresses a dense 3D Gaussian splatting map through voxelization to create a sparse base representation, then uses a lightweight upsampling network to restore high-quality novel views from this compact map.
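The voxelization step can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `voxelize_gaussians`, the opacity-weighted averaging, and the max-opacity rule are all assumptions chosen for illustration, and the upsampling network is omitted entirely. The sketch only shows how grouping Gaussian centers by voxel index reduces the number of primitives that must be stored.

```python
import numpy as np


def voxelize_gaussians(means, colors, opacities, voxel_size):
    """Collapse all Gaussians that share a voxel into one representative.

    Hypothetical merge rule (an assumption, not the paper's): positions and
    colors are opacity-weighted averages, opacity is the per-voxel maximum.
    Assumes a non-empty input and strictly positive opacities.
    """
    means = np.asarray(means, dtype=np.float64)
    colors = np.asarray(colors, dtype=np.float64)
    opacities = np.asarray(opacities, dtype=np.float64)

    # Integer voxel coordinates of each Gaussian center.
    keys = np.floor(means / voxel_size).astype(np.int64)

    # inverse[i] is the index of the occupied voxel containing Gaussian i.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten; shape varies across NumPy versions
    n_voxels = int(inverse.max()) + 1

    # Total opacity per voxel, used to normalize the weighted averages.
    weight_sum = np.bincount(inverse, weights=opacities, minlength=n_voxels)

    def weighted_mean(values):
        # Opacity-weighted mean of each column, accumulated per voxel.
        cols = [
            np.bincount(inverse, weights=opacities * values[:, d],
                        minlength=n_voxels)
            for d in range(values.shape[1])
        ]
        return np.stack(cols, axis=1) / weight_sum[:, None]

    # Keep the strongest opacity found in each voxel.
    merged_opacities = np.zeros(n_voxels)
    np.maximum.at(merged_opacities, inverse, opacities)

    return weighted_mean(means), weighted_mean(colors), merged_opacities


# Toy input: two Gaussians share the voxel at the origin, one sits apart.
means = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.5, 0.1, 0.1]])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
opacities = np.array([1.0, 1.0, 0.5])

m, c, o = voxelize_gaussians(means, colors, opacities, voxel_size=1.0)
print(len(means), "->", len(m))  # 3 -> 2
```

A larger `voxel_size` merges more Gaussians per cell, trading rendering detail for memory. In the approach described above, that lost detail is what the upsampling network is meant to recover.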

Key results

Uses only 54.54% of the memory of the best-performing comparison method while retaining over 90% of its reconstruction quality
Achieves a 98.25% average memory reduction compared to the original dense 3D Gaussian representation
Boosts visual localization accuracy by an average of 36.59% through rendered novel views
Provides effective data augmentation that improves data-driven visual navigation performance
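
The two memory figures above are measured against different references, which a short calculation makes easier to compare. The helper names below are hypothetical, and treating the reported average reduction as a single ratio is a simplifying assumption, since per-scene ratios may differ.

```python
def retained_fraction(reduction_percent):
    """Fraction of the original memory left after a given percent reduction."""
    return 1.0 - reduction_percent / 100.0


def size_ratio(retained):
    """How many times smaller the compressed map is than its reference."""
    return 1.0 / retained


# Relative to the original dense 3D Gaussian representation (98.25% reduction).
dense = retained_fraction(98.25)
print(f"{dense:.4f} of the memory, {size_ratio(dense):.1f}x smaller")
# 0.0175 of the memory, 57.1x smaller

# Relative to the best-performing comparison method (54.54% of its memory).
print(f"{size_ratio(0.5454):.2f}x smaller")
# 1.83x smaller
```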

Why it matters

Enables memory-constrained robots to deploy high-fidelity neural rendering for real-time localization, navigation, and training without hardware bottlenecks.

Abstract

In this paper, we introduce a novel approach for modeling a memory-efficient spatial representation with 3D Gaussian splatting. Efficient vision-based spatial representation poses a significant challenge due to the memory demands of visual information. Recent advances in 3D rendering technologies, such as neural radiance fields (NeRF) and 3D Gaussian splatting, have prompted exploration of their applications in robotics. However, such 3D rendering methods often focus on rendering high-quality images, requiring numerous parameters and resulting in large data, which are unsuitable for robotics applications. To tackle this challenge, we introduce 3DSR, an efficient voxelized renderable neural 3D spatial representation that utilizes 3D Gaussian splatting. 3DSR leverages the strengths of both voxelization (memory efficiency) and 3D Gaussian splatting (high-quality image reconstruction). The proposed method achieves memory efficiency by reducing the number of 3D Gaussians in the 3D representation through voxelization, while preserving the image quality required for effective vision-based robotic applications. Experimental results demonstrate that 3DSR achieves over 90% of the best method’s reconstruction quality while requiring only 54.54% of its memory. Additional experiments on visual localization and navigation further confirm that the proposed method is readily applicable to robotics.

Index terms

Visual Learning, Computer Vision for Automation, Vision-Based Navigation