Memory-Efficient Voxelized Renderable Neural 3D Spatial Representation for Vision-Based Robotics
Howoong Jun, Seongbo Ha, Jaewon Lee, Hyeonwoo Yu, Songhwai Oh
AI summary
Problem
Existing neural 3D rendering methods are too memory-intensive for real-time robotics, while traditional voxel maps lack the detailed visual fidelity robots need for perception and navigation.
Approach
The method compresses a dense 3D Gaussian splatting map through voxelization to create a sparse base representation, then uses a lightweight upsampling network to restore high-quality novel views from this compact map.
Key results
- Uses only 54.54% of the memory of the best-performing method while retaining over 90% of its reconstruction quality
- Achieves a 98.25% average memory reduction compared to the original dense 3D Gaussian representation
- Boosts visual localization accuracy by an average of 36.59% through rendered novel views
- Provides effective data augmentation that improves data-driven visual navigation performance
Why it matters
Enables memory-constrained robots to deploy high-fidelity neural rendering for real-time localization, navigation, and training without hardware bottlenecks.
Abstract
In this paper, we introduce a novel approach for modeling a memory-efficient spatial representation with 3D Gaussian splatting. Efficient vision-based spatial representation poses a significant challenge due to the memory demands of visual information. Recent advances in 3D rendering technologies, such as neural radiance fields (NeRF) and 3D Gaussian splatting, have prompted exploration of their applications in robotics. However, such 3D rendering methods often focus on rendering high-quality images, which requires numerous parameters and produces data sizes too large for robotics applications. To tackle this challenge, we introduce 3DSR, an efficient voxelized renderable neural 3D spatial representation that utilizes 3D Gaussian splatting. 3DSR leverages the strengths of both voxelization (memory efficiency) and 3D Gaussian splatting (high-quality image reconstruction). The proposed method achieves memory efficiency by reducing the number of 3D Gaussians in the 3D representation through voxelization, while preserving the image quality required for effective vision-based robotic applications. Experimental results demonstrate that 3DSR achieves over 90% of the best method’s reconstruction quality while requiring only 54.54% of its memory. Additional experiments on visual localization and navigation further confirm that the proposed method is readily applicable to robotics.
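To make concrete the idea of reducing the number of 3D Gaussians through voxelization, the following is a minimal sketch and not the authors' implementation. It assumes each Gaussian is described only by a center and an RGB color, and that all Gaussians whose centers fall in the same voxel are merged by simple averaging. The function name `voxelize_gaussians`, the `voxel_size` parameter, and the averaging rule are illustrative assumptions; the abstract does not specify how 3DSR merges Gaussians.

```python
import math
from collections import defaultdict


def voxelize_gaussians(means, colors, voxel_size):
    """Merge Gaussians that share a voxel (hypothetical helper, not 3DSR itself).

    means:  list of (x, y, z) Gaussian centers
    colors: list of (r, g, b) values, one per Gaussian
    Returns a dict mapping integer voxel index -> (mean center, mean color).
    """
    # Group Gaussian indices by the integer voxel that contains their center.
    buckets = defaultdict(list)
    for i, (x, y, z) in enumerate(means):
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append(i)

    # Assumed merge rule: replace each group by the average center and color.
    merged = {}
    for key, members in buckets.items():
        n = len(members)
        center = tuple(sum(means[i][a] for i in members) / n for a in range(3))
        color = tuple(sum(colors[i][a] for i in members) / n for a in range(3))
        merged[key] = (center, color)
    return merged


# Three Gaussians; the first two share the voxel (0, 0, 0) when voxel_size is 1.0.
example_means = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.1, 0.1)]
example_colors = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
sparse_map = voxelize_gaussians(example_means, example_colors, voxel_size=1.0)
```

In this sketch a larger `voxel_size` merges more Gaussians per voxel, which lowers memory use at the cost of detail. A full Gaussian splatting map also stores covariance, opacity, and view-dependent color terms, which are omitted here.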