PSKDNet: Position-Supervised Keypoints Diffusion Network for Online Vectorized HD Map Construction
Mingkun Jiang, Jun Dong, JunMing He, Guangyu Hou, Fan Ma, Shuang Wu, Yujing Zhang
AI summary
Problem
Online high-definition map construction degrades significantly when deployed in novel geographical regions with limited training data, as traditional deterministic methods lack robust supervisory signals to handle domain shifts.
Approach
The authors introduce PSKDNet, which applies a keypoints diffusion model to iteratively denoise map element predictions for richer supervision, combined with a Progressive Position Relation Transformer that dynamically learns and enforces spatial relationships between road elements.
Key results
- First successful application of keypoints diffusion to vectorized map construction
- Achieves 68.4% mAP on nuScenes and 68.5% mAP on Argoverse2, surpassing state-of-the-art methods
- Delivers +8.2 mAP and +5.3 mAP gains on geographically disjoint splits, indicating strong domain generalization
- PPRT module preserves geometric structural integrity under high noise, validated by specialized spatial metrics
Why it matters
Provides a robust, data-efficient framework for autonomous vehicles to build accurate HD maps in unseen environments, advancing scalable perception systems.
Abstract
Online high-definition map construction represents a critical challenge in autonomous driving systems. Existing approaches suffer from generalization degradation when confronted with domain shifts across different geographical regions, particularly when facing limited training data in novel scenarios. To address this issue, we propose PSKDNet, a Position-Supervised Keypoints Diffusion Network that applies keypoints diffusion models to vectorized map construction for the first time. Our approach introduces Spatio-Temporal Keypoints Diffusion, which provides additional supervisory information through the diffusion process, thereby enhancing model generalization under domain shifts. To ensure accurate supervision of spatial relationships between map elements, we propose the Progressive Position Relation Transformer, which employs pointset similarity networks to obtain learnable position masks for explicitly supervising spatial relationships between map elements. Extensive experiments on nuScenes and Argoverse datasets demonstrate that PSKDNet achieves superior performance over state-of-the-art methods, with significant improvements in detection accuracy and robustness to environmental variations. To the best of our knowledge, this work represents the first successful application of diffusion models to vectorized map construction, opening new research directions in autonomous driving perception systems.
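To make the idea of diffusing keypoints concrete, the following is a minimal sketch of the standard DDPM-style forward (noising) step applied to the 2D keypoints of a single map element. It is not the authors' implementation: the abstract does not specify the noise schedule, the coordinate normalization, or the spatio-temporal conditioning, so the linear schedule, the 1000-step horizon, the normalized bird's-eye-view coordinates, and all function names below are assumptions made purely for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, as in vanilla DDPM)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_keypoints(keypoints, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) for a list of (x, y) keypoints.

    Returns the noisy keypoints and the Gaussian noise that was added,
    which is the usual regression target for a denoising network.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noisy, noise = [], []
    for x, y in keypoints:
        ex, ey = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        noise.append((ex, ey))
        noisy.append((signal * x + sigma * ex, signal * y + sigma * ey))
    return noisy, noise


# Hypothetical lane-divider keypoints in normalized BEV coordinates [-1, 1].
lane = [(-0.8, 0.1), (-0.3, 0.12), (0.2, 0.15), (0.7, 0.2)]
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
rng = random.Random(0)
noisy_lane, eps = noise_keypoints(lane, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a diffusion-based detector of this general kind, a network would be trained to recover the clean keypoints (or the added noise) from `noisy_lane` across many timesteps, which is one way the diffusion process can supply the additional supervisory signal the abstract refers to.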