ICRA 2026

HE-VPR: Height Estimation Enabled Aerial Visual Place Recognition Against Scale Variance

Mengfan He, Xingyu Shao, Chunyu Li, Chao Chen, Liangzheng Sun, Ziyang Meng, YuanQing Wu

AI summary

HE-VPR decouples height estimation from place recognition using two parallel bypass adapters on a shared frozen backbone, improving Recall@1 by up to 6.1% and cutting memory usage by up to 90% under severe aerial scale variations.
Aerial VPR, Height Estimation, Visual Place Recognition, Bypass Adapters, Scale Variance, UAV Navigation

Problem

Aerial visual place recognition struggles with severe scale variations caused by changing flight altitudes, making full-database retrieval computationally prohibitive and memory-intensive for resource-constrained UAVs.

Approach

The framework attaches two parallel bypass adapters to a shared frozen DINOv2 backbone. The first adapter retrieves the query's height partition from a compact height database; the second performs place recognition within the corresponding height-specific sub-database. A center-weighted masking strategy mitigates the scale differences that remain within a partition.
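
The two-stage lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the descriptors have already been produced by the two adapter branches, it assumes cosine similarity for both lookups, and every function name, array layout, and dictionary key is hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def estimate_height_partition(q_height, height_db, height_labels):
    """Stage 1 (assumed form): nearest neighbour in a compact height database.

    q_height      : (d,) query descriptor from the height branch
    height_db     : (n, d) reference descriptors
    height_labels : (n,) partition index of each reference descriptor
    """
    sims = l2_normalize(height_db) @ l2_normalize(q_height)
    return int(height_labels[int(np.argmax(sims))])


def retrieve_place(q_place, sub_databases, partition, top_k=1):
    """Stage 2 (assumed form): search only the sub-database of one partition.

    sub_databases maps a partition index to a dict with hypothetical keys
    "descriptors" ((m, d) array) and "place_ids" (list of length m).
    """
    db = sub_databases[partition]
    sims = l2_normalize(db["descriptors"]) @ l2_normalize(q_place)
    order = np.argsort(-sims)[:top_k]
    return [db["place_ids"][i] for i in order]
```

Because stage 2 touches only one sub-database, the number of descriptors compared per query shrinks roughly in proportion to the number of partitions, which is the intuition behind the reduced search space.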

Key results

  • Up to 6.1% Recall@1 improvement over ViT-based baselines
  • Up to 90% reduction in memory usage
  • Decoupled height estimation and place recognition pipeline
  • Center-weighted masking strategy for residual scale robustness
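
One way to picture a center-weighted mask is as a set of weights over the backbone's patch grid that favours the image center when patch features are pooled. The summary does not specify the mask's exact form, so the Gaussian profile, the `sigma` parameter, and the weighted-mean pooling below are all assumptions chosen purely for illustration.

```python
import numpy as np


def center_weight_mask(grid_h, grid_w, sigma=0.5):
    """Gaussian weights over a patch grid, peaked at the center (assumed form).

    Coordinates are normalized to [-1, 1]; weights are scaled to sum to 1.
    """
    ys = np.linspace(-1.0, 1.0, grid_h)
    xs = np.linspace(-1.0, 1.0, grid_w)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    mask = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return mask / mask.sum()


def masked_pool(patch_tokens, mask):
    """Weighted sum of patch tokens.

    patch_tokens : (grid_h * grid_w, dim) array in row-major patch order
    mask         : (grid_h, grid_w) weights that sum to 1
    """
    weights = mask.reshape(-1, 1)
    return (patch_tokens * weights).sum(axis=0)
```

The motivation for down-weighting the border is that, when two images of the same place are taken at slightly different heights, their centers tend to overlap while the content near the edges differs.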

Why it matters

Enables practical, memory-efficient UAV navigation and localization in GNSS-denied environments with highly variable flight altitudes.

Abstract

In this work, we propose HE-VPR, a visual place recognition (VPR) framework that incorporates height estimation. Our system decouples height inference from place recognition, allowing both modules to share a frozen DINOv2 backbone. Two lightweight bypass adapter branches are integrated into our system. The first estimates the height partition of the query image via retrieval from a compact height database, and the second performs VPR within the corresponding height-specific sub-database. The adaptation design reduces training cost and significantly decreases the search space of the database. We also adopt a center-weighted masking strategy to further enhance the robustness against scale differences. Experiments on two self-collected challenging multi-altitude datasets demonstrate that HE-VPR achieves up to 6.1% Recall@1 improvement over state-of-the-art ViT-based baselines and reduces memory usage by up to 90%. These results indicate that HE-VPR offers a scalable and efficient solution for height-aware aerial VPR, enabling practical deployment in GNSS-denied environments. All the code and datasets for this work have been released on https://github.com/hmf21/HE-VPR.
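
For readers unfamiliar with the Recall@1 metric cited in the abstract, the sketch below shows how Recall@K is commonly computed in VPR evaluation: a query counts as correct if any of its top-K retrieved references is a true match. The function name and input layout are hypothetical, and the paper's own criterion for what counts as a true match (for example, a distance threshold) is not stated here.

```python
def recall_at_k(ranked_ids, ground_truth, k=1):
    """Fraction of queries with at least one true match in the top-k results.

    ranked_ids   : list of per-query result lists, best match first
    ground_truth : list of per-query sets of correct reference ids
    """
    hits = 0
    for ranked, positives in zip(ranked_ids, ground_truth):
        if any(ref in positives for ref in ranked[:k]):
            hits += 1
    return hits / len(ranked_ids)
```

With `k=1`, only the single best-ranked reference is checked for each query.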

Index terms

Aerial Systems: Perception and Autonomy; Deep Learning for Visual Perception
