ICRA 2026

M2G-Net: Multimodal Mutual-Guidance Network for LiDAR Depth and Intensity Completion

Donghyun Choi, Sangmin Lee, Jee-Hwan Ryu

AI summary

M2G-Net jointly completes occluded LiDAR depth and intensity maps by letting each modality guide the other through structure-aware and attention-based feature interaction, outperforming existing inpainting and depth completion baselines.

LiDAR completion, depth-intensity fusion, mutual guidance, occlusion inpainting, autonomous driving, SLAM

Problem

Dynamic objects in urban environments occlude LiDAR scans, degrading SLAM and perception accuracy. Prior methods focus primarily on depth completion while neglecting intensity, overlooking the complementary relationship between the two modalities that could enhance reconstruction fidelity.

Approach

The authors propose a dual-branch network with a Multimodal Mutual-Guidance (M2G) module that enables symmetric, bidirectional feature interaction between depth and intensity using structure-aware guidance and spatial-channel attention, eliminating the need for external RGB images.
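The bidirectional interaction described above can be illustrated with a small NumPy sketch. This is not the authors' M2G module: the function names, the parameter-free sigmoid gates, and the residual form are all assumptions made for illustration, and the structure-aware guidance term is not modeled. A real network would compute the gates with learned convolutions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_gates(feat):
    """Compute channel and spatial gates from a (C, H, W) feature map.

    Hypothetical, parameter-free stand-in for learned attention layers.
    """
    # Channel gate: one weight per channel from global average pooling.
    channel_gate = sigmoid(feat.mean(axis=(1, 2)))[:, None, None]  # (C, 1, 1)
    # Spatial gate: one weight per pixel from the mean over channels.
    spatial_gate = sigmoid(feat.mean(axis=0))[None, :, :]          # (1, H, W)
    return channel_gate, spatial_gate


def mutual_guidance(depth_feat, inten_feat):
    """Symmetric exchange: each branch is modulated by the other's gates."""
    d_ch, d_sp = attention_gates(depth_feat)
    i_ch, i_sp = attention_gates(inten_feat)
    # Residual form keeps each branch's own features and adds a guided copy.
    depth_out = depth_feat + depth_feat * i_ch * i_sp
    inten_out = inten_feat + inten_feat * d_ch * d_sp
    return depth_out, inten_out
```

Because the same rule is applied in both directions, swapping the two inputs simply swaps the two outputs, which is the sense in which this toy interaction is symmetric.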

Key results

Jointly reconstructs occluded depth and intensity from raw LiDAR data
Achieves 35–40% lower MAE for depth and improved intensity metrics over baselines
Preserves 3D geometric consistency and structural integrity in completed point clouds
Runs inference at ~26.6 Hz with 13.87M parameters, supporting real-time deployment
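For readers unfamiliar with the metric, the sketch below shows one common way to compute mean absolute error (MAE) over a subset of pixels and to express an improvement as a percentage. Restricting the error to an occlusion mask is an assumption here, since this summary does not state the paper's exact evaluation protocol, and the function names are hypothetical.

```python
import numpy as np


def masked_mae(pred, target, mask):
    """Mean absolute error over the pixels selected by `mask`.

    Assumption: errors are evaluated only where mask is True
    (for example, pixels that were occluded in the input).
    """
    mask = np.asarray(mask).astype(bool)
    if not mask.any():
        raise ValueError("mask selects no pixels")
    return float(np.abs(pred[mask] - target[mask]).mean())


def relative_reduction(baseline_mae, method_mae):
    """Percentage by which method_mae is lower than baseline_mae."""
    return 100.0 * (baseline_mae - method_mae) / baseline_mae
```

Under this definition, a "35–40% lower MAE" means `relative_reduction` returns a value between 35 and 40 when the baseline's MAE and the method's MAE are passed in that order.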

Why it matters

Provides a robust, intensity-aware completion framework that enhances the reliability and applicability of LiDAR-based perception and SLAM in dynamic autonomous driving scenarios.

Abstract

Autonomous driving has rapidly advanced with diverse sensors, especially Light Detection and Ranging (LiDAR), which provides precise geometry for tasks like simultaneous localization and mapping (SLAM). Recently, the performance of LiDAR-based SLAM has improved through studies leveraging intensity as a complementary cue to depth. However, in urban environments, dynamic objects occlude static scenes, degrading the stability and accuracy of LiDAR-based SLAM. While previous studies have focused mainly on completing occluded depth, they often disregard intensity, assuming it to be less critical or difficult to estimate due to inherent noise. This overlooks the strong complementary relationship between the two modalities, which can be exploited for effective multimodal completion. Furthermore, completing intensity alongside depth enables broader applicability to intensity-aware perception tasks. To address this issue, a Multimodal Mutual-Guidance (M2G) module is proposed for the joint completion of occluded depth and intensity in LiDAR data. M2G is integrated into a deep learning-based network that takes projected range and intensity images as input, enabling progressive cross-modal feature interaction. Leveraging the shared origin of LiDAR depth and intensity, M2G balances noisy intensity and smooth depth via attention and structure-aware guidance. Experimental results demonstrate that the proposed method outperforms existing inpainting and depth completion approaches, validating its effectiveness for LiDAR completion.
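The abstract mentions projected range and intensity images as the network input. A common way to obtain such images is a spherical projection of the point cloud, sketched below. The image size (64 x 1024) and the vertical field of view (+3 to -25 degrees) are assumed values typical of a 64-beam sensor, not settings taken from the paper, and the function name is hypothetical.

```python
import numpy as np


def project_to_images(points, intensity, height=64, width=1024,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (N, 3) points and (N,) intensities to range/intensity images.

    All default parameters are illustrative assumptions.
    Empty pixels are left at 0.
    """
    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    depth = np.linalg.norm(points, axis=1)
    valid = depth > 0  # drop points at the sensor origin
    points, intensity, depth = points[valid], intensity[valid], depth[valid]

    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / depth)

    # Map angles to pixel coordinates: yaw -> column, pitch -> row.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    u = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # Write far points first so the nearest return ends up in each pixel.
    order = np.argsort(depth)[::-1]
    range_img = np.zeros((height, width), dtype=np.float32)
    inten_img = np.zeros((height, width), dtype=np.float32)
    range_img[v[order], u[order]] = depth[order]
    inten_img[v[order], u[order]] = intensity[order]
    return range_img, inten_img
```

Both images share the same pixel grid because they come from the same returns, which is the "shared origin" of depth and intensity that the abstract refers to.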

Index terms

Deep Learning for Visual Perception