ICRA 2026

I2D-LocX: An Efficient, Precise and Robust Method for Camera Localization in LiDAR Maps

Huai Yu, Xubo Zhu, Shu Han, Wen Yang, Gui-Song Xia

AI summary

I2D-LocX achieves centimeter-level camera localization in LiDAR maps with only 37 ms inference time by using training-only auxiliary branches to boost accuracy without increasing model complexity.

Camera localization, LiDAR maps, cross-modal registration, lightweight network, real-time pose estimation, flow-based matching

Problem

Existing camera localization methods in LiDAR maps often sacrifice computational efficiency for accuracy, hindering real-time deployment, while lightweight models suffer from insufficient constraints and LiDAR sparsity.

Approach

The framework predicts pixel-to-point flow maps using a lightweight network, augmented by two parameter-sharing auxiliary branches that provide zero-flow feature and confidence constraints during training but are discarded at inference to preserve speed.
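The training-only branch pattern described above can be illustrated with a toy sketch. This is not the paper's architecture: `shared_encoder`, `flow_head`, `confidence_head`, and `forward` are hypothetical names, and the arithmetic inside them is a placeholder for real network layers. The only point shown is that the auxiliary outputs are computed from shared parameters during training and skipped entirely at inference, so the inference path costs the same as the main branch alone.

```python
from dataclasses import dataclass
from math import exp
from typing import List, Optional

Vector = List[float]


def shared_encoder(x: Vector) -> Vector:
    """Hypothetical stand-in for the feature extractor shared by all branches."""
    return [2.0 * v for v in x]


def flow_head(features: Vector) -> Vector:
    """Hypothetical main-branch decoder: features -> flow values."""
    return [f + 1.0 for f in features]


def confidence_head(features: Vector) -> Vector:
    """Hypothetical auxiliary decoder: features -> confidence in (0, 1)."""
    return [1.0 / (1.0 + exp(-f)) for f in features]


@dataclass
class Outputs:
    flow: Vector
    confidence: Optional[Vector] = None          # training only
    zero_flow_features: Optional[Vector] = None  # training only


def forward(x: Vector,
            x_zero_flow: Optional[Vector] = None,
            training: bool = False) -> Outputs:
    """Run the main branch; run the auxiliary branches only when training."""
    features = shared_encoder(x)
    flow = flow_head(features)
    if not training:
        # Inference: auxiliary branches are never evaluated.
        return Outputs(flow=flow)

    # Training: both auxiliary branches reuse the shared encoder weights.
    confidence = confidence_head(features)
    zero_features = (shared_encoder(x_zero_flow)
                     if x_zero_flow is not None else None)
    return Outputs(flow=flow,
                   confidence=confidence,
                   zero_flow_features=zero_features)
```

Calling `forward(x)` returns only the flow, while `forward(x, x_zero_flow, training=True)` returns the same flow plus the two auxiliary outputs that a training loss would consume.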

Key results

Centimeter-level localization accuracy on KITTI, Argoverse, Waymo, and nuScenes
Fast 37 ms per-frame inference time suitable for real-time applications
Zero-flow feature branch effectively mitigates LiDAR point cloud sparsity
Confidence-weighted pixel loss dynamically improves matching robustness
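As a rough illustration of a confidence-weighted pixel loss, the sketch below weights each pixel's flow error by a predicted confidence. The exact formulation used by I2D-LocX is not given in this summary, so everything here is an assumption: the function name `confidence_weighted_loss`, the `-log(confidence)` regularizer (a common way to stop the confidence from collapsing to zero), the `reg_weight` parameter, and the validity mask that skips pixels without a LiDAR point.

```python
import math
from typing import Sequence, Tuple

Flow = Tuple[float, float]  # (dx, dy) displacement in pixels


def confidence_weighted_loss(pred: Sequence[Flow],
                             target: Sequence[Flow],
                             confidence: Sequence[float],
                             valid: Sequence[bool],
                             reg_weight: float = 0.1,
                             eps: float = 1e-6) -> float:
    """Mean over valid pixels of c * ||pred - target|| - reg_weight * log(c).

    Assumed form, not taken from the paper. Low confidence shrinks the
    error term for a pixel but pays a penalty through the log term.
    """
    total = 0.0
    count = 0
    for p, t, c, v in zip(pred, target, confidence, valid):
        if not v:
            continue  # no LiDAR point projects to this pixel
        err = math.hypot(p[0] - t[0], p[1] - t[1])
        c = min(max(c, eps), 1.0)  # keep log() finite
        total += c * err - reg_weight * math.log(c)
        count += 1
    return total / count if count else 0.0
```

With all confidences equal to 1 the regularizer vanishes and the loss reduces to the mean endpoint error over valid pixels.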

Why it matters

Provides a practical, low-cost solution for real-time visual localization in autonomous driving and robotic navigation by balancing high precision with computational efficiency.

Abstract

Camera localization within LiDAR maps has gained significant attention due to its potential for accurate positioning with low-cost and lightweight sensors compared to LiDAR-based systems. However, existing methods often prioritize localization accuracy, sometimes compromising efficiency, which can limit their suitability for real-time applications. To address these issues, we propose I2D-LocX, a lightweight monocular camera localization framework with three branches, establishing pixel-level and feature-level constraints to enhance localization performance without increasing model complexity. Specifically, the main branch generates a flow map to represent pixel-point displacements. One auxiliary branch shares the same input as the main branch and employs an additional decoder to evaluate the confidence of the flow map. The other auxiliary branch leverages a zero-flow generated from the displacement-free input to guide feature matching, thereby enhancing localization robustness. Notably, both auxiliary branches share parameters with the main branch and are omitted during inference, ensuring computational efficiency. Extensive experiments on benchmark datasets, including KITTI-Odometry, Argoverse, Waymo, and nuScenes, show that I2D-LocX can achieve centimeter-level localization accuracy with about 37 ms inference time, greatly improving the localization performance for real-world applications.

Index terms

Localization, SLAM, Sensor Fusion