SEPT: Standard-Definition Map Enhanced Scene Perception and Topology Reasoning for Autonomous Driving
Muleilan Pei, Jiayao Shan, Peiliang Li, Jieqi Shi, Jing Huo, Yang Gao, Shaojie Shen
AI summary
Problem
Mapless driving systems struggle with long-range visibility and occlusions due to onboard sensor limits, while existing methods fail to effectively align and fuse standard-definition map priors with visual features.
Approach
SEPT combines rasterized and vectorized standard-definition map features with Bird’s-Eye-View representations using a dual-gated fusion module and adaptive alignment, plus an auxiliary intersection detection task.
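One way to picture the gated fusion idea is the minimal sketch below, which operates on a single BEV cell. It is an illustration under stated assumptions, not the authors' implementation: the summary does not define what "dual-gated" means, so the sketch assumes one learned scalar gate per SD map branch (rasterized and vectorized) and a residual sum. All names (`gate`, `dual_gated_fuse`, `params`) are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate(bev, prior, weights, bias):
    """Scalar gate in (0, 1) computed from concatenated BEV and prior features.

    Assumption: a single linear layer followed by a sigmoid; the paper's
    actual gate design is not given in the summary.
    """
    concat = bev + prior  # list concatenation, length 2 * C
    z = sum(w * x for w, x in zip(weights, concat)) + bias
    return sigmoid(z)


def dual_gated_fuse(bev, raster, vector, params):
    """Fuse one BEV cell with rasterized and vectorized SD map features.

    Assumption: "dual" is read here as one gate per SD map branch. Each gate
    decides how much of its prior is added to the BEV feature, so a poorly
    aligned prior can be suppressed by a gate value near zero.
    """
    g_r = gate(bev, raster, params["w_raster"], params["b_raster"])
    g_v = gate(bev, vector, params["w_vector"], params["b_vector"])
    fused = [b + g_r * r + g_v * v for b, r, v in zip(bev, raster, vector)]
    return fused, g_r, g_v


# Hypothetical 2-channel features for a single BEV cell.
params = {
    "w_raster": [0.0, 0.0, 0.0, 0.0], "b_raster": 0.0,
    "w_vector": [0.0, 0.0, 0.0, 0.0], "b_vector": 0.0,
}
fused, g_r, g_v = dual_gated_fuse([1.0, 2.0], [2.0, 0.0], [0.0, 4.0], params)
print(fused, g_r, g_v)
```

With zero weights and biases, both gates sit at 0.5 and each prior contributes half of its value. A strongly negative bias drives a gate toward zero, leaving the BEV feature nearly unchanged, which is the behavior one would want when the SD map is misaligned with the scene.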
Key results
- Hybrid rasterized and vectorized SD map encoding with adaptive feature alignment
- Dual-gated feature fusion module for seamless BEV feature augmentation
- Auxiliary intersection-aware keypoint detection task for enhanced topology reasoning
- State-of-the-art performance on OpenLane-V2 with significant gains in perception and topology metrics
Why it matters
Enables scalable, low-cost mapless driving by leveraging widely available navigation maps to overcome sensor limitations in complex urban environments.
Abstract
Online scene perception and topology reasoning are critical for autonomous vehicles to understand their driving environments, particularly for mapless driving systems that endeavor to reduce reliance on costly High-Definition (HD) maps. However, recent advances in online scene understanding still face limitations, especially in long-range or occluded scenarios, due to the inherent constraints of onboard sensors. To address this challenge, we propose a Standard-Definition (SD) map Enhanced scene Perception and Topology reasoning (SEPT) framework, which explores how to effectively incorporate the SD map as prior knowledge into existing perception and reasoning pipelines. Specifically, we introduce a novel hybrid feature fusion strategy that combines SD maps with Bird’s-Eye-View (BEV) features, considering both rasterized and vectorized representations, while mitigating potential misalignment between SD maps and BEV feature spaces. Additionally, we leverage the SD map characteristics to design an auxiliary intersection-aware keypoint detection task, which further enhances the overall scene understanding performance. Experimental results on the large-scale OpenLane-V2 dataset demonstrate that by effectively integrating SD map priors, our framework significantly improves both scene perception and topology reasoning, outperforming existing methods by a substantial margin.