ICRA 2026

PSKDNet: Position-Supervised Keypoints Diffusion Network for Online Vectorized HD Map Construction

Mingkun Jiang, Jun Dong, JunMing He, Guangyu Hou, Fan Ma, Shuang Wu, Yujing Zhang

AI summary

PSKDNet leverages keypoints diffusion modeling and position-supervised transformers to significantly boost the accuracy and generalization of online HD map construction under domain shifts.
Online HD Map Construction, Diffusion Models, Vectorized Mapping, Domain Generalization, Autonomous Driving, Keypoint Prediction

Problem

Online high-definition map construction degrades significantly when deployed in novel geographical regions with limited training data, as traditional deterministic methods lack robust supervisory signals to handle domain shifts.

Approach

The authors introduce PSKDNet, which applies a keypoints diffusion model to iteratively denoise map element predictions for richer supervision, combined with a Progressive Position Relation Transformer that dynamically learns and enforces spatial relationships between road elements.
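To make the idea of denoising keypoints concrete, here is a minimal sketch of the standard DDPM-style forward noising step applied to map-element keypoints, together with its algebraic inverse. This summary does not give the paper's noise schedule, keypoint parameterization, or denoiser, so the linear schedule, the array shapes, and the function names (`linear_alpha_bar`, `add_noise`, `recover_clean`) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def linear_alpha_bar(num_steps: int, beta_start: float = 1e-4,
                     beta_end: float = 0.02) -> np.ndarray:
    """Cumulative signal rate alpha_bar[t] for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def add_noise(keypoints: np.ndarray, t: int, alpha_bar: np.ndarray,
              rng: np.random.Generator):
    """Forward diffusion: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps."""
    eps = rng.standard_normal(keypoints.shape)
    a = alpha_bar[t]
    noisy = np.sqrt(a) * keypoints + np.sqrt(1.0 - a) * eps
    return noisy, eps


def recover_clean(noisy: np.ndarray, eps_hat: np.ndarray, t: int,
                  alpha_bar: np.ndarray) -> np.ndarray:
    """Invert the forward step given an estimate of the injected noise."""
    a = alpha_bar[t]
    return (noisy - np.sqrt(1.0 - a) * eps_hat) / np.sqrt(a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = linear_alpha_bar(1000)
    # Hypothetical layout: 5 map elements, 20 keypoints each, (x, y) in [-1, 1].
    clean = rng.uniform(-1.0, 1.0, size=(5, 20, 2))
    noisy, eps = add_noise(clean, t=500, alpha_bar=alpha_bar, rng=rng)
    # A trained network would predict eps from (noisy, t, image features);
    # the true noise is reused here only to show that the algebra round-trips.
    restored = recover_clean(noisy, eps, t=500, alpha_bar=alpha_bar)
    print("max reconstruction error:", np.abs(restored - clean).max())
```

In a training setup of this kind, the denoiser would be supervised at many noise levels rather than only on its final output, which is one plausible reading of the "richer supervision" mentioned above.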

Key results

  • Claimed by the authors as the first successful application of keypoints diffusion to vectorized map construction
  • Achieves 68.4% mAP on nuScenes and 68.5% mAP on Argoverse2, surpassing state-of-the-art methods
  • Delivers gains of +8.2 mAP and +5.3 mAP on geographically disjoint splits, indicating strong domain generalization
  • PPRT module preserves geometric structural integrity under high noise, validated by specialized spatial metrics

Why it matters

Provides a robust, data-efficient framework for autonomous vehicles to build accurate HD maps in unseen environments, advancing scalable perception systems.

Abstract

Online high-definition map construction represents a critical challenge in autonomous driving systems. Existing approaches suffer from generalization degradation when confronted with domain shifts across different geographical regions, particularly when facing limited training data in novel scenarios. To address this issue, we propose PSKDNet, a Position-Supervised Keypoints Diffusion Network that applies keypoints diffusion models to vectorized map for the first time. Our approach introduces Spatio-Temporal Keypoints Diffusion, which provides additional supervisory information through the diffusion process, thereby enhancing model generalization under domain shifts. To ensure accurate supervision of spatial relationships between map elements, we propose the Progressive Position Relation Transformer, which employs pointset similarity networks to obtain learnable position masks for explicitly supervising spatial relationships between map elements. Extensive experiments on nuScenes and Argoverse datasets demonstrate that PSKDNet achieves superior performance over state-of-the-art methods, with significant improvements in detection accuracy and robustness to environmental variations. To the best of our knowledge, this work represents the first successful application of diffusion models to vectorized map construction, opening new research directions in autonomous driving perception systems.

Index terms

Autonomous Vehicle Navigation, Deep Learning for Visual Perception, Recognition
