ICRA 2026

Ensemble-Based Event Camera Place Recognition under Varying Illumination

Therese Joseph, Tobias Fischer, Michael J Milford

AI summary

Fusing diverse event reconstructions, feature extractors, and temporal resolutions via late score fusion yields up to 77% relative gain in place recognition accuracy during day-night transitions.

Event cameras, Visual place recognition, Ensemble learning, Illumination robustness, Autonomous navigation, Sequence matching

Problem

Event-based visual place recognition struggles under severe illumination changes, and prior ensemble methods only fuse across temporal resolutions, leaving reconstruction and feature diversity underexplored.

Approach

The authors aggregate similarity scores from multiple event-to-frame reconstruction methods, state-of-the-art VPR feature extractors, and varying temporal resolutions, combined with a modified sequence matching algorithm that adapts to dynamic history lengths.
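The fusion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`seq_match`, `zscore_rows`, `fuse_scores`), the row-wise z-scoring, the plain averaging across ensemble members, and the way the history is shortened at the start of a traverse are all assumptions made for illustration. Each ensemble member (one reconstruction method, feature extractor, and temporal resolution) is assumed to supply a query-by-reference similarity matrix.

```python
import numpy as np

def seq_match(sim, seq_len):
    """Simplified sequence matching (assumed form): average similarity along
    the diagonal ending at (q, r), using a shorter history when fewer than
    seq_len past frames are available."""
    n_q, n_r = sim.shape
    out = np.zeros_like(sim, dtype=float)
    for q in range(n_q):
        for r in range(n_r):
            k = min(seq_len, q + 1, r + 1)  # history actually available
            out[q, r] = np.mean([sim[q - i, r - i] for i in range(k)])
    return out

def zscore_rows(sim):
    """Standardise each query's scores (one row) to zero mean, unit variance."""
    mean = sim.mean(axis=1, keepdims=True)
    std = sim.std(axis=1, keepdims=True)
    return (sim - mean) / np.maximum(std, 1e-12)  # guard against constant rows

def fuse_scores(similarity_matrices, seq_len=1):
    """Late score fusion: sequence-match and z-normalise each member's
    similarity matrix, then average the results element-wise."""
    members = [
        zscore_rows(seq_match(np.asarray(s, dtype=float), seq_len))
        for s in similarity_matrices
    ]
    return np.mean(members, axis=0)

# Hypothetical usage: two ensemble members scoring 4 queries against 4 references.
member_a = np.eye(4)
member_b = 2.0 * np.eye(4) + 0.1
fused = fuse_scores([member_a, member_b], seq_len=2)
best_match = fused.argmax(axis=1)  # predicted reference index for each query
```

Normalising before averaging keeps a member with a larger raw score range from dominating the fused result, which is one plausible reason to pair late fusion with z-scoring.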

Key results

Up to 77% relative improvement in Recall@1 across day-night transitions
Comprehensive ablation of binning strategies, reconstruction methods, and feature extractors
Modified sequence matching with dynamic history length and z-score normalization
Validation on two long-term driving datasets without metric subsampling
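For readers unfamiliar with the headline metric, the sketch below shows how Recall@1 and a relative improvement are typically computed. The helper names, the `tol` parameter (how many reference frames away still counts as a correct match), and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def recall_at_1(sim, ground_truth, tol=0):
    """Fraction of queries whose top-scoring reference lies within `tol`
    frames of the ground-truth reference index."""
    best = np.argmax(sim, axis=1)
    return float(np.mean(np.abs(best - np.asarray(ground_truth)) <= tol))

def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline

# Toy example with made-up values: a perfect similarity matrix,
# and a baseline of 0.50 improved to 0.75.
perfect = recall_at_1(np.eye(4), np.arange(4))
gain = relative_improvement(0.50, 0.75)
```

Note that a relative improvement is measured against the baseline value, so it is not the same as the absolute difference in Recall@1.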

Why it matters

Provides a robust, hardware-efficient solution for long-term localization and loop closure in autonomous vehicles operating under extreme lighting conditions.

Abstract

Compared to conventional cameras, event cameras provide a high dynamic range and low latency, offering greater robustness to rapid motion and challenging lighting conditions. Although the potential of event cameras for visual place recognition (VPR) has been established, developing robust VPR frameworks under severe illumination changes remains an open research problem. Here, we introduce an ensemble-based approach to event camera place recognition that combines sequence-matched results from multiple event-to-frame reconstructions, VPR feature extractors, and temporal resolutions. Unlike previous event-based ensemble methods, which only utilise temporal resolution, our broader fusion strategy delivers significantly improved robustness under varied lighting conditions (e.g., afternoon, sunset, night), achieving up to 77% relative improvement in Recall@1 across day-night transitions. We evaluate our approach on two long-term driving datasets (with 8 km per traverse) without metric subsampling, thereby preserving natural variations in speed and stop duration that influence event density. We also conduct a comprehensive analysis of key design choices, including binning strategies, reconstruction methods, and feature extractors, to identify the most critical components for robust performance. Additionally, we propose a modification to the standard sequence matching framework that enhances performance at longer sequence lengths. To facilitate future research, we release our codebase and benchmarking framework.

Index terms

Localization, Computer Vision for Transportation