ICRA 2026

MMD-SLAM: Structure-Enhanced Multi-Meta Gaussian Distribution-Guided Visual SLAM

Hui Zhu, Hongxing Zhou, Sixun Liu, and Chunmao Jiang∗

AI summary

Key figure (auto-extracted from paper)
MMD-SLAM achieves state-of-the-art tracking and photorealistic mapping by guiding Multi-Meta Gaussians with structural priors from the Atlanta World assumption.
Visual SLAM, 3D Gaussian Splatting, Structure Enhancement, Atlanta World Assumption, Photorealistic Mapping, Multi-Meta Gaussians

Problem

Existing 3D Gaussian Splatting-based Visual SLAM systems often overlook underlying structural regularities in man-made environments, resulting in inconsistent maps, blurred artifacts, and suboptimal rendering quality.

Approach

The framework fuses point and line features for robust pose tracking and introduces a Multi-Meta Gaussian representation that explicitly encodes structural priors, optimized via an adaptive evolution strategy and structure-aware loss functions.
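
To make the point and line fusion idea concrete, here is a minimal sketch of a combined reprojection cost of the kind commonly used in point-line SLAM systems. It is not the authors' formulation: the function names, the endpoint-to-line distance used as the line residual, and the `line_weight` balancing factor are all assumptions made for illustration.

```python
import math

def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point (z > 0) to pixels."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

def point_residual(point_cam, observed_px, intrinsics):
    """Euclidean pixel distance between the projected and observed keypoint."""
    u, v = project(point_cam, *intrinsics)
    return math.hypot(u - observed_px[0], v - observed_px[1])

def line_through(p, q):
    """Normalized 2D line (a, b, c) through pixels p and q, with a^2 + b^2 = 1."""
    a, b = p[1] - q[1], q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    norm = math.hypot(a, b)
    return (a / norm, b / norm, c / norm)

def line_residuals(endpoints_cam, observed_segment, intrinsics):
    """Distances from the two projected 3D endpoints to the observed 2D line."""
    a, b, c = line_through(*observed_segment)
    dists = []
    for endpoint in endpoints_cam:
        u, v = project(endpoint, *intrinsics)
        dists.append(abs(a * u + b * v + c))
    return tuple(dists)

def fused_cost(point_terms, line_terms, intrinsics, line_weight=1.0):
    """Squared point residuals plus weighted squared line residuals.

    line_weight is a hypothetical balancing factor, not taken from the paper.
    """
    cost = sum(point_residual(p, obs, intrinsics) ** 2 for p, obs in point_terms)
    for endpoints, segment in line_terms:
        dists = line_residuals(endpoints, segment, intrinsics)
        cost += line_weight * sum(d ** 2 for d in dists)
    return cost
```

In a real tracker this cost would be minimized over the camera pose; the sketch only evaluates it for points already expressed in the camera frame. Line terms add constraints in low-texture regions where few keypoints are available, which is the usual motivation for fusing the two feature types.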

Key results

  • 48.56% reduction in ATE RMSE on ScanNet, compared with MonoGS
  • 5.71% PSNR improvement on Replica, compared with MonoGS
  • State-of-the-art tracking and mapping performance across real-world and synthetic benchmarks
  • Multi-Meta Gaussian evolution strategy for adaptive geometric fitting
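
For readers unfamiliar with the metrics behind these percentages, the sketch below shows how ATE RMSE, PSNR, and a relative change against a baseline are typically computed. The page does not spell out the exact evaluation protocol, so this assumes trajectories that are already aligned and time-associated, and percentages taken relative to the baseline value. The function names are illustrative.

```python
import math

def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose translation error; trajectories assumed already aligned."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))

def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)

def relative_reduction(baseline, ours):
    """Percent decrease against the baseline (for lower-is-better metrics)."""
    return 100.0 * (baseline - ours) / baseline

def relative_improvement(baseline, ours):
    """Percent increase against the baseline (for higher-is-better metrics)."""
    return 100.0 * (ours - baseline) / baseline
```

As a toy illustration with made-up numbers (not values from the paper), `relative_reduction(2.0, 1.0)` gives 50.0: halving the baseline error is a 50% reduction.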

Why it matters

Provides a robust, high-fidelity mapping solution essential for embodied AI, augmented reality, and autonomous robotics applications.

Abstract

3D Gaussian Splatting (3DGS) has significantly boosted novel view synthesis and high-fidelity scene reconstruction, expanding the potential of 3DGS-based Visual Simultaneous Localization and Mapping (SLAM) methods. However, most existing systems fail to fully exploit the underlying structural information, which limits rendering quality and often leads to inconsistent maps. To address these limitations, we propose MMD-SLAM, a structure-enhanced Visual SLAM framework that leverages the Atlanta World (AW) assumption to guide a Multi-Meta Gaussian representation for photorealistic mapping. First, we introduce a point–line fusion strategy for pose optimization, where 3D line segments are incorporated to improve tracking robustness and provide additional constraints for mapping. Second, we design a Multi-Meta Gaussian representation with dominant directions, explicitly encoding structural priors from the AW hypothesis. Finally, we propose a Gaussian evolution strategy that adapts to scene geometry and incorporates structural cues into global optimization. Extensive experiments demonstrate that these innovations enable MMD-SLAM to achieve state-of-the-art performance in both tracking accuracy and mapping quality. For example, our method achieves a 48.56% reduction in ATE RMSE on ScanNet and a 5.71% improvement in PSNR on Replica, compared with MonoGS.
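
The Atlanta World assumption models a scene with one vertical dominant direction and any number of horizontal dominant directions, each orthogonal to the vertical. The sketch below illustrates that structure and one plausible way to score how well a surface normal agrees with it. The abstract does not give the structure-aware loss, so `alignment_loss` is a hypothetical stand-in, and the choice of the z axis as vertical is an assumption.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def atlanta_directions(horizontal_angles_deg):
    """One vertical axis (assumed +z) plus horizontal directions in the xy plane.

    Unlike a Manhattan frame, the horizontal directions need not be
    mutually orthogonal; they only need to be orthogonal to the vertical.
    """
    directions = [(0.0, 0.0, 1.0)]
    for angle in horizontal_angles_deg:
        rad = math.radians(angle)
        directions.append((math.cos(rad), math.sin(rad), 0.0))
    return directions

def alignment_loss(normal, directions):
    """Hypothetical loss: 0 when the normal is parallel to a dominant direction.

    Uses |cos| so that a normal and its negation score the same.
    """
    unit = normalize(normal)
    best = max(abs(dot(unit, d)) for d in directions)
    return 1.0 - best
```

For instance, with `atlanta_directions([0, 90, 45])` a wall whose normal points along (1, 1, 0) matches the 45 degree direction and receives a loss close to zero, whereas a pure Manhattan frame (0 and 90 degrees only) would penalize it.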

Index terms

Mapping, SLAM, Localization