HiMAP: History-Aware Map-Occupancy Prediction with Fallback
Yiming Xu, Yi Yang, Hao Cheng, Monika Sester
AI summary
Problem
Most trajectory prediction models depend on stable multi-object tracking with consistent identities, but tracking frequently fails due to occlusion, ID switches, or missed detections, which degrades prediction accuracy and increases safety risk. The paper targets robust motion forecasting when reliable tracking identities are unavailable.
Approach
HiMAP converts past detections into spatiotemporally invariant historical occupancy maps and uses a query module to iteratively retrieve agent-specific history from these unlabeled representations. This reconstructed history, combined with current state and map context, drives a DETR-style decoder to generate multi-modal future trajectories without requiring tracking IDs.
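To make the first step concrete, the following is a minimal sketch of turning ID-free detections into per-frame occupancy grids expressed in one agent's local frame. It is an illustration under stated assumptions, not the authors' implementation: the function name `rasterize_history`, the binary grid, the grid size, and the cell resolution are all hypothetical, and the paper's actual occupancy representation may differ.

```python
import math

def rasterize_history(frames, center, heading, grid_size=8, cell=1.0):
    """Rasterize ID-free detections into one binary grid per past frame.

    frames:  list of past frames, each a list of (x, y) world-frame detections
             with no identity attached.
    center:  (x, y) current position of the agent the grids are centered on.
    heading: current heading of that agent in radians.
    grid_size, cell: hypothetical grid extent (cells) and resolution (meters).
    """
    cos_h, sin_h = math.cos(-heading), math.sin(-heading)
    half = grid_size * cell / 2.0
    grids = []
    for detections in frames:
        grid = [[0] * grid_size for _ in range(grid_size)]
        for x, y in detections:
            # Translate, then rotate into the agent's local frame so the
            # encoding does not depend on the global pose.
            dx, dy = x - center[0], y - center[1]
            lx = dx * cos_h - dy * sin_h
            ly = dx * sin_h + dy * cos_h
            col = math.floor((lx + half) / cell)
            row = math.floor((ly + half) / cell)
            # Detections outside the local window are simply dropped.
            if 0 <= row < grid_size and 0 <= col < grid_size:
                grid[row][col] = 1
        grids.append(grid)
    return grids
```

Because every detection is written into the grid without an identity, nothing in this representation depends on a tracker; associating cells with a particular agent is left to the downstream query step.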
Key results
- Achieves performance comparable to tracking-based methods on Argoverse 2 without using identities
- Outperforms a fine-tuned QCNet baseline in the no-tracking setting, with relative gains of 11% in FDE and 12% in ADE and a 4% reduction in MR
- Delivers stable simultaneous forecasts for all agents without waiting for tracking recovery
- Enables efficient streaming inference through reusable spatiotemporally invariant encodings
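The results above are reported in terms of ADE (average displacement error), FDE (final displacement error), and MR (miss rate). The sketch below shows how these are commonly computed for multi-modal forecasts and how a relative gain is derived from two scores. The helper names, the independent minimum over modes, and the 2.0 m miss threshold are assumptions based on common benchmark practice, not details taken from the paper.

```python
import math

def min_ade_fde(modes, gt):
    """Return (minADE, minFDE) over K candidate trajectories for one agent.

    modes: list of K trajectories, each a list of (x, y) points.
    gt:    ground-truth trajectory with the same number of points.
    The two minima are taken independently (an assumption; some benchmarks
    instead report the ADE of the mode with the lowest FDE).
    """
    ades, fdes = [], []
    for traj in modes:
        dists = [math.dist(p, q) for p, q in zip(traj, gt)]
        ades.append(sum(dists) / len(dists))  # mean error over the horizon
        fdes.append(dists[-1])                # error at the final timestep
    return min(ades), min(fdes)

def miss_rate(min_fdes, threshold=2.0):
    """Fraction of agents whose best endpoint misses by more than threshold."""
    return sum(f > threshold for f in min_fdes) / len(min_fdes)

def relative_gain(baseline, ours):
    """Relative improvement of a lower-is-better metric over a baseline."""
    return (baseline - ours) / baseline
```

As a purely illustrative calculation with made-up numbers, a baseline FDE of 2.00 m and a model FDE of 1.78 m give `relative_gain(2.00, 1.78)` of about 0.11, which is what a relative gain of 11% means.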
Why it matters
It provides a critical safety fallback for autonomous driving systems, ensuring reliable motion forecasting and decision-making even when perception tracking modules fail.
Abstract
Accurate motion forecasting is critical for autonomous driving, yet most predictors rely on multi-object tracking (MOT) with identity association, assuming that objects are correctly and continuously tracked. When tracking fails due to, e.g., occlusion, identity switches, or missed detections, prediction quality degrades and safety risks increase. We present HiMAP, a tracking-free trajectory prediction framework that remains reliable under MOT failures. HiMAP converts past detections into spatiotemporally invariant historical occupancy maps and introduces a historical query module that conditions on the current agent state to iteratively retrieve agent-specific history from unlabeled occupancy representations. The retrieved history is summarized by a temporal map embedding and, together with the final query and map context, drives a DETR-style decoder to produce multi-modal future trajectories. This design lifts identity reliance, supports streaming inference via reusable encodings, and serves as a robust fallback when tracking is unavailable. On Argoverse 2, HiMAP achieves performance comparable to tracking-based methods while operating without IDs, and it substantially outperforms strong baselines in the no-tracking setting, yielding relative gains of 11% in FDE, 12% in ADE, and a 4% reduction in MR over a fine-tuned QCNet. Beyond aggregate metrics, HiMAP delivers stable forecasts for all agents simultaneously without waiting for tracking to recover, highlighting its practical value for safety-critical autonomy. The code is available at: https://github.com/XuYiMing83/HiMAP.
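The idea of a query that iteratively retrieves agent-specific history from unlabeled features can be illustrated with generic scaled dot-product attention. The sketch below is not the HiMAP module: the names `attend` and `retrieve_history`, the backward-in-time order, and the residual query update are all assumptions chosen for illustration, and a real model would use learned projections.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention of one query over key/value vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def retrieve_history(query, history):
    """Refine a query frame by frame, collecting what it retrieves.

    query:   feature vector derived from the current agent state.
    history: list ordered oldest to newest; each item is a (keys, values)
             pair holding the ID-free features of one past frame.
    Returns the retrieved vectors (oldest first) and the final query.
    """
    retrieved = []
    for keys, values in reversed(history):  # newest frame first (assumption)
        context = attend(query, keys, values)
        retrieved.append(context)
        # Residual update: the next, older frame is queried with what has
        # been retrieved so far (hypothetical update rule).
        query = [q + c for q, c in zip(query, context)]
    retrieved.reverse()
    return retrieved, query
```

The point of the sketch is that no identity label is needed at any step: which past features belong to the agent is decided softly by the attention weights, conditioned on the agent's current state.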