ICRA 2026

DVMM: A Dual-View Combination Descriptor for Multi-Modal LiDARs Online Place Recognition

Xuzhe Duan, Qingwu Hu, Mingyao Ai, Pengcheng Zhao, Jiayuan Li

AI summary

DVMM enables robust, modality-invariant online place recognition across diverse LiDAR sensors by combining an azimuthal descriptor for coarse detection with a cross-section descriptor for fine verification.
Place recognition, multi-modal LiDAR, collaborative SLAM, loop closure detection, descriptor design, robotics

Problem

Existing place recognition descriptors, designed for single-agent SLAM, struggle with the inherent differences across multi-modal LiDARs in collaborative SLAM systems: scanning density, range, mounting height, and horizontal and vertical field of view (HFOV, VFOV).

Approach

The method projects point clouds onto an adaptive grid to generate a 1D azimuthal descriptor for coarse loop candidate retrieval, then verifies candidates using a binary cross-section occupancy image encoded from a fixed height range.
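
As a rough illustration of the coarse stage, the Python sketch below builds a 1D azimuthal descriptor by Gaussian-weighted column summation over a polar grid and compares two descriptors by vector matching. It is not the authors' implementation. The fixed grid (the paper's grid is adaptive), the weighting over range rings, all parameter values, and the circular-shift cosine matching are assumptions made for illustration, and the names azimuthal_descriptor and best_shift_similarity are hypothetical.

```python
import math


def azimuthal_descriptor(points, num_sectors=60, num_rings=20,
                         max_range=50.0, sigma=0.5):
    """Return a 1D descriptor with one value per azimuth sector.

    points: iterable of (x, y, z) tuples in the sensor frame.
    Assumption: a fixed polar grid stands in for the paper's adaptive grid.
    """
    # Polar occupancy grid: rows are range rings, columns are azimuth sectors.
    grid = [[0.0] * num_sectors for _ in range(num_rings)]
    for x, y, _z in points:
        r = math.hypot(x, y)
        if r <= 0.0 or r >= max_range:
            continue
        theta = math.atan2(y, x) % (2.0 * math.pi)
        col = min(int(theta / (2.0 * math.pi) * num_sectors), num_sectors - 1)
        row = min(int(r / max_range * num_rings), num_rings - 1)
        grid[row][col] = 1.0

    # Assumed Gaussian weights over rings, centered on the middle ring.
    center = (num_rings - 1) / 2.0
    weights = [math.exp(-0.5 * ((i - center) / (sigma * num_rings)) ** 2)
               for i in range(num_rings)]

    # Weighted summation down each column gives one value per sector.
    return [sum(weights[i] * grid[i][j] for i in range(num_rings))
            for j in range(num_sectors)]


def cosine_similarity(a, b):
    dot = sum(u * v for u, v in zip(a, b))
    norm_a = math.sqrt(sum(u * u for u in a))
    norm_b = math.sqrt(sum(v * v for v in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_shift_similarity(query, candidate):
    """Try every circular shift of candidate; return (best similarity, shift)."""
    best_sim, best_shift = -1.0, 0
    for shift in range(len(candidate)):
        rolled = candidate[shift:] + candidate[:shift]
        sim = cosine_similarity(query, rolled)
        if sim > best_sim:
            best_sim, best_shift = sim, shift
    return best_sim, best_shift
```

Under these assumptions, the returned shift multiplied by the sector width (360 / num_sectors degrees) gives a coarse yaw offset between two scans, and the similarity score can be thresholded to keep or reject a loop candidate before the finer cross-section check.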

Key results

  • Significantly outperforms state-of-the-art descriptors on public and real-world multi-modal LiDAR datasets
  • Robustly handles variations in HFOV, VFOV, mounting height, and point density across seven different LiDAR sensors
  • Achieves accurate coarse-to-fine loop closure detection with simultaneous 4-DoF relative pose estimation
  • Seamlessly integrates into collaborative SLAM frameworks for cross-agent localization

Why it matters

It enables multi-robot systems to reliably share localization data across heterogeneous LiDAR hardware, advancing robust collaborative mapping in diverse environments.

Abstract

Existing place recognition descriptors developed for single-agent SLAM struggle with multi-modal LiDAR differences in collaborative SLAM. To overcome this, we propose an online place recognition method for multi-modal LiDARs. This method introduces a dual-view combination descriptor, termed DVMM, by separately encoding azimuthal and vertical scene information. The place recognition process consists of two stages: loop closure detection and verification. In the detection stage, point clouds are projected onto an adaptive grid and a 1D azimuthal descriptor is generated via Gaussian-weighted column summation. The azimuthal descriptor is utilized to retrieve loop candidates through vector matching. In the verification stage, point clouds within a fixed height range are encoded as a binary occupancy image, which serves as the cross-section descriptor. Accurate loop closures are determined by performing image matching on the cross-section descriptors. We evaluate the proposed method on both public and real-world datasets encompassing a total of seven LiDAR sensors. The results demonstrate that DVMM significantly outperforms state-of-the-art descriptors in handling multi-modal LiDAR data and is compatible with collaborative SLAM systems.
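
The verification stage described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes the points are already expressed in a common, gravity-aligned frame, so that a fixed height range selects the same slice of the scene for every sensor. The height limits, cell size, image extent, and the use of intersection-over-union as the image-matching score are placeholders chosen here, and the names cross_section_image and occupancy_iou are hypothetical.

```python
import math


def cross_section_image(points, z_min=0.5, z_max=1.5,
                        cell=1.0, half_extent=20.0):
    """Encode points inside a fixed height range as a binary occupancy image.

    points: iterable of (x, y, z) tuples.
    Assumption: z is measured in a frame shared by all sensors.
    """
    size = int(2 * half_extent / cell)
    image = [[0] * size for _ in range(size)]
    for x, y, z in points:
        if not (z_min <= z <= z_max):
            continue
        # floor (not int) so points just outside the negative edge are rejected
        col = math.floor((x + half_extent) / cell)
        row = math.floor((y + half_extent) / cell)
        if 0 <= row < size and 0 <= col < size:
            image[row][col] = 1
    return image


def occupancy_iou(img_a, img_b):
    """Intersection-over-union of two binary images of equal size.

    Assumption: IoU stands in for the paper's unspecified image matching.
    """
    inter = 0
    union = 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            inter += a & b
            union += a | b
    return inter / union if union else 0.0
```

In a coarse-to-fine pipeline of this kind, only the candidates retrieved by the azimuthal descriptor would be scored this way, and a candidate would count as a loop closure when its score clears a chosen threshold. Because the image is binary, the score depends on which cells are occupied and not on how many points fall in each cell.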

Index terms

SLAM, Multi-Robot SLAM, Localization
