MMD-SLAM: Structure-Enhanced Multi-Meta Gaussian Distribution-Guided Visual SLAM
Hui Zhu, Hongxing Zhou, Sixun Liu, and Chunmao Jiang∗
AI summary
Problem
Existing 3D Gaussian Splatting-based Visual SLAM systems often overlook underlying structural regularities in man-made environments, resulting in inconsistent maps, blurred artifacts, and suboptimal rendering quality.
Approach
The framework fuses point and line features for robust pose tracking and introduces a Multi-Meta Gaussian representation that explicitly encodes structural priors, optimized via an adaptive evolution strategy and structure-aware loss functions.
Key results
- 48.56% reduction in ATE RMSE on ScanNet, compared with MonoGS
- 5.71% PSNR improvement on Replica, compared with MonoGS
- State-of-the-art tracking and mapping performance across real-world and synthetic benchmarks
- Multi-Meta Gaussian evolution strategy for adaptive geometric fitting
Why it matters
Provides a robust, high-fidelity mapping solution essential for embodied AI, augmented reality, and autonomous robotics applications.
Abstract
3D Gaussian Splatting (3DGS) has significantly advanced novel view synthesis and high-fidelity scene reconstruction, expanding the potential of 3DGS-based Visual Simultaneous Localization and Mapping (SLAM) methods. However, most existing systems fail to fully exploit the underlying structural information, which limits rendering quality and often leads to inconsistent maps. To address these limitations, we propose MMD-SLAM, a structure-enhanced Visual SLAM framework that leverages the Atlanta World (AW) assumption to guide a Multi-Meta Gaussian representation for photorealistic mapping. First, we introduce a point–line fusion strategy for pose optimization, where 3D line segments are incorporated to improve tracking robustness and provide additional constraints for mapping. Second, we design a Multi-Meta Gaussian representation with dominant directions, explicitly encoding structural priors from the AW hypothesis. Finally, we propose a Gaussian evolution strategy that adapts to scene geometry and incorporates structural cues into global optimization. Extensive experiments demonstrate that these innovations enable MMD-SLAM to achieve state-of-the-art performance in both tracking accuracy and mapping quality. For example, compared with MonoGS, our method achieves a 48.56% reduction in ATE RMSE on ScanNet and a 5.71% improvement in PSNR on Replica.
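The two headline metrics, ATE RMSE and PSNR, follow standard definitions, and a minimal sketch of how such figures are typically computed may help in reading the reported percentages. The sketch below is not the authors' evaluation code. It assumes the estimated trajectory has already been aligned to the ground truth and that images are floating-point arrays in [0, 1]. The function names `ate_rmse`, `psnr`, and `relative_change_percent` are hypothetical, chosen only for illustration.

```python
import numpy as np


def ate_rmse(est_xyz, gt_xyz):
    """Root-mean-square translational error between two trajectories.

    Assumption: both inputs are (N, 3) position arrays that are already
    time-associated and aligned to a common frame.
    """
    est = np.asarray(est_xyz, dtype=float)
    gt = np.asarray(gt_xyz, dtype=float)
    # Per-pose Euclidean distance between estimated and true positions.
    err = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))


def psnr(rendered, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images with peak value max_val."""
    diff = np.asarray(rendered, dtype=float) - np.asarray(reference, dtype=float)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


def relative_change_percent(baseline, ours):
    """Signed relative change of `ours` with respect to `baseline`, in percent.

    Negative values are reductions (desirable for ATE RMSE); positive
    values are improvements for PSNR.
    """
    return 100.0 * (ours - baseline) / baseline
```

With toy inputs (not values from the paper), `relative_change_percent(2.0, 1.0)` returns `-50.0`, i.e. a 50% reduction. The reported 48.56% and 5.71% figures are relative changes of this kind with respect to MonoGS.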