ICRA 2026

Prepare for Warp Speed: Sub-Millisecond Visual Place Recognition Using Event Cameras

Vignesh Ramanathan, Michael J Milford, Tobias Fischer

AI summary

Flash enables sub-millisecond visual place recognition using event cameras, improving Recall@1 over existing baselines by 11.33× indoors and 5.92× outdoors and reducing localization latency for high-speed robots.
Event cameras, Visual place recognition, Sub-millisecond latency, Binary frames, High-speed robotics, Neuromorphic vision

Problem

Existing event-camera VPR methods accumulate tens to hundreds of milliseconds of data to create dense representations, wasting the sensor's microsecond temporal resolution and causing high latency unsuitable for high-speed navigation.

Approach

Flash encodes active pixel locations from sub-millisecond event windows into sparse binary frames and computes place similarity using fast bitwise overlap operations, corrected by a lightweight normalization to prevent activity bias.
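
The approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the bit-packing layout, and in particular the normalization (overlap divided by the geometric mean of the two active-pixel counts) are assumptions chosen for illustration, since the summary only states that similarities are normalized by the relative event activity of the query and reference frames.

```python
import numpy as np

# Lookup table: number of set bits in each possible byte value.
_POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint16)


def events_to_binary_frame(xs, ys, width, height):
    """Mark every pixel that fired at least one event in the window.

    xs, ys are integer pixel coordinates of the events in one short window.
    Polarity and event counts are ignored; only activity is recorded.
    The frame is bit-packed (8 pixels per byte) for fast bitwise comparison.
    """
    frame = np.zeros((height, width), dtype=bool)
    frame[ys, xs] = True
    return np.packbits(frame.ravel())


def popcount(packed):
    """Count the active pixels in a bit-packed frame."""
    return int(_POPCOUNT[packed].sum())


def similarity(query, reference):
    """Bitwise overlap with an ASSUMED activity normalization.

    The raw overlap (AND + popcount) favours frames with many active pixels,
    so it is divided by sqrt(n_query * n_reference). This particular choice
    is hypothetical and may differ from the normalization used in Flash.
    """
    overlap = popcount(np.bitwise_and(query, reference))
    n_q, n_r = popcount(query), popcount(reference)
    if n_q == 0 or n_r == 0:
        return 0.0
    return overlap / np.sqrt(n_q * n_r)


def best_match(query, references):
    """Return the index of the most similar reference frame."""
    scores = [similarity(query, ref) for ref in references]
    return int(np.argmax(scores))
```

Without the normalization step, a reference frame captured during a burst of activity would overlap with almost any query and win by default, which is the activity bias the correction is meant to prevent.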

Key results

  • First event-camera VPR system to operate using sub-millisecond event windows
  • Recall@1 improvements for sub-millisecond VPR over existing baselines of 11.33× on the indoor QCR-Event-Dataset and 5.92× on the 8 km Brisbane-Event-VPR dataset
  • Introduction of Time to Correct Match (TCM) metric to quantify localization latency
  • Efficient binary-frame representation with bitwise operations enabling ultra-low processing latency
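
The two metrics named above can be sketched as follows. This page does not give the formal definition of TCM, so the version here (time elapsed from the first query until the first correct prediction) is an assumption, as are the function names and the `tolerance` parameter used to decide whether a predicted place index counts as correct.

```python
def recall_at_1(predictions, ground_truth, tolerance=0):
    """Fraction of queries whose top-1 prediction is within `tolerance`
    place indices of the ground-truth place."""
    correct = sum(
        abs(pred - gt) <= tolerance
        for pred, gt in zip(predictions, ground_truth)
    )
    return correct / len(predictions)


def time_to_correct_match(timestamps, predictions, ground_truth, tolerance=0):
    """ASSUMED reading of TCM: time from the first query until the first
    correct top-1 prediction. Returns None if no prediction is correct."""
    start = timestamps[0]
    for t, pred, gt in zip(timestamps, predictions, ground_truth):
        if abs(pred - gt) <= tolerance:
            return t - start
    return None
```

Under this reading, shorter event windows produce predictions more often, so a correct match can arrive sooner even when per-query recall is lower than with long accumulation windows.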

Why it matters

Enables reliable, ultra-fast localization for high-speed autonomous robots and vehicles operating in GPS-denied or bandwidth-constrained environments.

Abstract

Visual Place Recognition (VPR) enables systems to identify previously visited locations within a map, a fundamental task for autonomous navigation. Prior works have developed VPR solutions using event cameras, which asynchronously measure per-pixel brightness changes with microsecond temporal resolution. However, these works rely on dense representations of the inherently sparse camera output and require tens to hundreds of milliseconds of event data to predict a place. Here, we break this paradigm with Flash, a lightweight VPR system that predicts places using sub-millisecond slices of event data. Our method is based on the observation that active pixel locations provide strong discriminative features for VPR. Flash encodes these active pixel locations using efficient binary frames and computes similarities via fast bitwise operations, which are then normalized based on the relative event activity in the query and reference frames. Flash improves Recall@1 for sub-millisecond VPR over existing baselines by 11.33× on the indoor QCR-Event-Dataset and 5.92× on the 8 km Brisbane-Event-VPR dataset. Moreover, our method reduces the duration for which the robot must operate without awareness of its position, as evidenced by a localization latency metric we term Time to Correct Match (TCM). To the best of our knowledge, this is the first work to demonstrate sub-millisecond event-based VPR.

Index terms

Localization
