ICRA 2026

SAGA-SLAM: Scale-Adaptive 3D Gaussian Splatting for Visual SLAM

Kun Park, Seung-Woo Seo

AI summary

SAGA-SLAM enables robust, scale-adaptive 3D Gaussian Splatting SLAM across diverse environments without manual hyperparameter tuning.
3D Gaussian Splatting, Visual SLAM, Scale Adaptation, Polyak Step Size, Gaussian Fission, RGB-D Perception

Problem

Existing 3DGS-based SLAM methods fail to adapt to varying environmental scales and camera speeds, requiring manual learning rate tuning and suffering from insufficient Gaussian allocation in large-scale scenes.

Approach

The framework replaces fixed learning rates with a Polyak step size combined with momentum for automatic scale adaptation, and introduces a Gaussian fission technique that splits oversized Gaussians during densification.
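
As a rough illustration of the first idea, the sketch below combines a Polyak step size, (f(x) - f*) / ||grad f(x)||^2, with a momentum buffer on a toy quadratic loss. It is not the authors' implementation: the function names, the choice f* = 0, the momentum coefficient, and the exponential-average form of the momentum update are all assumptions made for this example.

```python
import numpy as np


def quadratic(scale, target):
    """Toy loss 0.5 * scale * ||x - target||^2 and its gradient (illustrative only)."""
    target = np.asarray(target, dtype=float)

    def loss_fn(x):
        d = x - target
        return 0.5 * scale * float(np.dot(d, d))

    def grad_fn(x):
        return scale * (x - target)

    return loss_fn, grad_fn


def polyak_momentum_minimize(loss_fn, grad_fn, x0, f_star=0.0, beta=0.9, steps=300):
    """Gradient descent with a Polyak step size and a momentum buffer.

    f_star is the assumed optimal loss value; 0.0 presumes the loss can
    in principle reach zero. beta is a hypothetical momentum coefficient.
    """
    x = np.asarray(x0, dtype=float).copy()
    velocity = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)
        gnorm2 = float(np.dot(g, g))
        if gnorm2 == 0.0:
            break  # stationary point: the Polyak step is undefined here
        # Polyak step size: remaining loss gap divided by squared gradient norm.
        step = (loss_fn(x) - f_star) / gnorm2
        # Exponential-average momentum on the Polyak-scaled gradient.
        velocity = beta * velocity - (1.0 - beta) * step * g
        x = x + velocity
    return x
```

On this toy loss the Polyak step works out to 0.5 / scale, so the parameter update is the same whether the loss is scaled by 1e-3 or 1e3. A fixed learning rate would need retuning between those two cases.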

Key results

  • State-of-the-art tracking and mapping on KITTI, Replica, and TUM-RGBD
  • Eliminates manual learning rate tuning across different scales
  • Prevents localization failure in large outdoor environments
  • Maintains high rendering quality without hyperparameter adjustments

Why it matters

Provides a robust, out-of-the-box SLAM solution for autonomous robots and AR systems navigating environments of vastly different sizes.

Abstract

3D Gaussian Splatting (3DGS) has recently emerged as a powerful technique for representing 3D scenes. Its high-fidelity rendering quality and speed have driven its rapid adoption in many applications. Among them, Visual Simultaneous Localization and Mapping (VSLAM) is the most prominent application, as it requires real-time simultaneous mapping and position tracking of navigating objects. However, from our comprehensive study, we observed a fundamental hurdle in directly applying the current 3DGS technique to VSLAM, which we define as the scale adaptation problem. The scale adaptation problem refers to the inability of existing 3DGS-based SLAM methods to address varying scales, specifically the extent of camera pose difference from the perspective of tracking, and environmental size in terms of mapping and the addition of new 3D Gaussians. To overcome this limitation, we propose SAGA-SLAM, the first scale-adaptive RGB-D dense SLAM framework based on 3DGS. We optimize the tracking and mapping stages robustly over various scales by utilizing the Polyak step size and momentum. Additionally, we present a Gaussian fission method to address the scale problem during the addition of 3D Gaussians. Experiments show that our method achieves state-of-the-art results robustly on both large-scale and small-scale datasets, such as KITTI, Replica, and TUM-RGBD. By adapting without the need for hyperparameter tuning, our method demonstrates both superior performance and practical applicability.
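
The abstract does not describe the mechanics of Gaussian fission, so the following is only a hypothetical sketch of what splitting an oversized Gaussian could look like. It assumes each Gaussian is described by a mean, three per-axis scales, and a rotation matrix whose columns are the axis directions. The halving rule, the child placement, and the `max_scale` threshold are invented for illustration and are not taken from the paper.

```python
import numpy as np


def fission(mean, scales, rotation, max_scale):
    """Recursively split a Gaussian until no axis scale exceeds max_scale.

    Returns a list of (mean, scales) pairs; children keep the parent's
    rotation. All rules here are illustrative assumptions.
    """
    if max_scale <= 0:
        raise ValueError("max_scale must be positive")
    mean = np.asarray(mean, dtype=float)
    scales = np.asarray(scales, dtype=float)
    k = int(np.argmax(scales))  # index of the longest axis
    if scales[k] <= max_scale:
        return [(mean, scales)]  # small enough: keep as-is
    # World-space direction of the longest axis (column k of the rotation).
    axis = np.asarray(rotation, dtype=float)[:, k]
    child_scales = scales.copy()
    child_scales[k] = scales[k] / 2.0  # hypothetical halving rule
    offset = child_scales[k] * axis    # place children on either side of the parent mean
    return (fission(mean - offset, child_scales, rotation, max_scale)
            + fission(mean + offset, child_scales, rotation, max_scale))
```

For example, under these assumptions a Gaussian with scales (4, 1, 1) and a limit of 1 is split twice along its long axis, giving four children with unit scales.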

Index terms

SLAM, RGB-D Perception