ICRA 2026

RelMap: Enhancing Online Map Construction with Class-Aware Spatial Relation and Semantic Priors

Tianhui Cai, Yun Zhang, Zewei Zhou, Zhiyu Huang, Jiaqi Ma

AI summary

Explicitly modeling spatial dependencies and class-specific semantics via learnable priors significantly boosts online HD map construction accuracy.

Online HD map construction, Spatial relations, Mixture-of-experts, Vectorized mapping, Autonomous driving, Transformer decoder

Problem

Current Transformer-based online HD map construction methods treat map elements independently. Ignoring the spatial and semantic relationships between elements limits prediction accuracy and generalization.

Approach

RelMap integrates a Class-aware Spatial Relation Prior to encode geometric dependencies between instances and a Mixture-of-Experts Semantic Prior that dynamically routes features to class-specific experts for refined decoding.
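
To make the idea of a class-aware spatial relation prior concrete, here is a minimal sketch and not the paper's implementation. It assumes each map element is a 2D polyline with a class index, summarizes each element by its centroid, and turns the relative offset between two elements into a scalar attention bias using weights looked up by the class pair. The names `centroid`, `relation_bias`, and `pair_weights`, and the choice of features (dx, dy, distance), are hypothetical; in a trained model the weights would be learned.

```python
import math

def centroid(polyline):
    """Mean point of a polyline given as a list of (x, y) tuples."""
    xs = [p[0] for p in polyline]
    ys = [p[1] for p in polyline]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def relation_bias(elements, pair_weights):
    """Pairwise bias matrix from relative positions (illustrative assumption).

    elements: list of (class_index, polyline)
    pair_weights[ci][cj]: (w_dx, w_dy, w_dist), one weight triple per class pair
    """
    n = len(elements)
    centers = [centroid(poly) for _, poly in elements]
    bias = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = centers[j][0] - centers[i][0]
            dy = centers[j][1] - centers[i][1]
            dist = math.hypot(dx, dy)
            # The weights depend on the classes of both elements.
            w = pair_weights[elements[i][0]][elements[j][0]]
            bias[i][j] = w[0] * dx + w[1] * dy + w[2] * dist
    return bias

# Hypothetical usage: class 0 = lane divider, class 1 = road boundary.
elements = [
    (0, [(0.0, 0.0), (0.0, 10.0)]),
    (1, [(3.0, 0.0), (3.0, 10.0)]),
]
# Every class pair penalizes distance only in this toy setup.
pair_weights = [[(0.0, 0.0, -0.1)] * 2 for _ in range(2)]
bias = relation_bias(elements, pair_weights)
```

A bias matrix of this kind could be added to attention logits between instance queries, so that the decoder sees how elements are positioned relative to one another; whether RelMap injects its prior this way is not stated in this summary.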

Key results

State-of-the-art performance on nuScenes and Argoverse 2 benchmarks
Seamless compatibility with single-frame and temporal perception backbones
Improved vectorized map prediction accuracy across lane dividers, crosswalks, and road boundaries
Elimination of separate routing networks in MoE, reducing model complexity and training overhead

Why it matters

Enables more scalable and accurate real-time HD map generation for autonomous vehicles by leveraging intrinsic map topology and semantics.

Abstract

Online high-definition (HD) map construction is crucial for scaling autonomous driving systems. While Transformer-based methods have become prevalent in online HD map construction, most existing approaches overlook the inherent spatial dependencies and semantic relationships between map elements, which constrains their accuracy and generalization capabilities. To address this, we propose RelMap, an end-to-end framework that explicitly models both spatial relations and semantic priors to enhance online HD map construction. Specifically, we introduce a Class-aware Spatial Relation Prior, which explicitly encodes relative positional dependencies between map elements using a learnable class-aware relation encoder. Additionally, we design a Mixture-of-Experts-based Semantic Prior, which routes features to class-specific experts based on predicted class probabilities, refining instance feature decoding. RelMap is compatible with both single-frame and temporal perception backbones, achieving state-of-the-art performance on the nuScenes and Argoverse 2 datasets.
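
The routing described in the abstract, where predicted class probabilities select class-specific experts, can be illustrated with a small sketch. This is an assumption-laden toy and not the authors' code: it uses one linear expert per class, weights every expert's output by the class probability (soft routing), and adds the result back to the input as a residual. The paper may instead use hard or top-k routing and a different expert architecture. The names `matvec` and `moe_semantic_prior` are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def moe_semantic_prior(feature, class_probs, experts):
    """Refine an instance feature with class-specific experts.

    feature: list of floats (one instance query feature)
    class_probs: predicted probability per class, used directly as
        routing weights, so no separate gating network is needed
    experts: one weight matrix per class (assumed linear experts)
    """
    mixed = [0.0] * len(feature)
    for p, W in zip(class_probs, experts):
        out = matvec(W, feature)
        mixed = [m + p * o for m, o in zip(mixed, out)]
    # Residual connection (an assumption of this sketch).
    return [f + m for f, m in zip(feature, mixed)]

# Hypothetical usage with two classes and 2-D features.
identity = [[1.0, 0.0], [0.0, 1.0]]
doubling = [[2.0, 0.0], [0.0, 2.0]]
feature = [1.0, 2.0]
confident = moe_semantic_prior(feature, [1.0, 0.0], [identity, doubling])
uncertain = moe_semantic_prior(feature, [0.5, 0.5], [identity, doubling])
```

When the classifier is confident, the output is dominated by a single expert; when it is uncertain, the experts are blended in proportion to the class probabilities.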

Index terms

Intelligent Transportation Systems, Computer Vision for Automation, Computer Vision for Transportation