ICRA 2026

Efficient Event Camera Volume System

Juan Soto, Ian Noronha, Saru Bharti, Upinder Kaur

AI summary

EECVS compresses event streams in real time without temporal binning artifacts by selecting among DCT, DTFT, and DWT transforms according to event density, improving reconstruction fidelity and cross-dataset generalization for robotic perception.

Event cameras, adaptive compression, Dirac impulse modeling, real-time robotics, dense event representation, transform selection

Problem

Standard robotic pipelines require dense inputs, but existing event-to-dense methods either sacrifice fine temporal resolution through fixed binning or lack adaptability to varying event stream densities.
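A minimal sketch of the binning problem, assuming a hypothetical helper `bin_events` that accumulates polarities into fixed time bins (a one-pixel simplification of a voxel grid, not the paper's code). Two events 80 ms apart land in the same bin, so the binned representation cannot tell them apart:

```python
import numpy as np

def bin_events(timestamps, polarities, window, n_bins):
    """Accumulate polarities into n_bins equal time bins over [0, window].

    Hypothetical helper for illustration; a one-pixel stand-in for a voxel grid.
    """
    hist, _ = np.histogram(timestamps, bins=n_bins, range=(0.0, window),
                           weights=polarities)
    return hist

# Two single-event streams whose timestamps differ by 80 ms.
early = bin_events([0.11], [1.0], window=1.0, n_bins=10)
late = bin_events([0.19], [1.0], window=1.0, n_bins=10)

# Both events fall in the bin covering [0.1, 0.2), so the outputs are identical.
same_after_binning = np.array_equal(early, late)
```

Increasing `n_bins` narrows the ambiguity but grows the representation, which is the trade-off fixed binning forces.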

Approach

The framework models events as continuous-time Dirac impulses and autonomously switches between DCT, DTFT, and DWT transforms based on real-time density analysis, followed by tailored coefficient pruning to generate compact dense representations.
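The approach can be sketched for a single pixel as follows. This is an illustration under stated assumptions, not the authors' implementation: the cosine basis cos(πkt/T) and its lack of normalization, the top-k magnitude pruning rule, the function names, and above all the density thresholds and the density-to-transform mapping in `select_transform` are placeholders, since the summary does not specify them. The one idea taken from the source is that, for an impulse train, a transform coefficient reduces to a sum of basis values at the event timestamps, so no binning is needed.

```python
import numpy as np

def dct_coeffs_at_timestamps(timestamps, polarities, window, n_coeffs):
    """Coefficients c_k = sum_i p_i * cos(pi * k * t_i / window), k = 0..n_coeffs-1.

    For an impulse train, integrating against the basis collapses to
    evaluating the basis at each event time (assumed unnormalized basis).
    """
    t = np.asarray(timestamps, dtype=float)
    p = np.asarray(polarities, dtype=float)
    k = np.arange(n_coeffs)[:, None]                  # shape (K, 1)
    basis = np.cos(np.pi * k * t[None, :] / window)   # shape (K, N)
    return basis @ p                                  # shape (K,)

def prune_top_k(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients (assumed rule)."""
    coeffs = np.asarray(coeffs, dtype=float)
    pruned = np.zeros_like(coeffs)
    if keep <= 0:
        return pruned
    idx = np.argsort(np.abs(coeffs))[-keep:]
    pruned[idx] = coeffs[idx]
    return pruned

def select_transform(n_events, window, low=1e3, high=1e5):
    """Pick a transform from the event rate (events per second).

    Hypothetical thresholds and mapping, chosen only to show the control flow.
    """
    rate = n_events / window
    if rate < low:
        return "DWT"
    if rate < high:
        return "DTFT"
    return "DCT"
```

A call such as `prune_top_k(dct_coeffs_at_timestamps(t, p, window=0.05, n_coeffs=24), keep=8)` would then yield a fixed-length vector per pixel that a dense pipeline can consume.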

Key results

Density-driven transform selection eliminates temporal binning artifacts
DTFT achieves lowest Earth Mover Distance and highest reconstruction fidelity
DCT delivers 1.5 ms latency and 2.7× higher throughput than alternatives
Achieves mean IoU of 0.87 on MVSEC with EventSAM segmentation, versus 0.44 for voxel grids at 24 channels
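For readers unfamiliar with the two metrics named above, a minimal sketch of how they are commonly computed. The paper's exact evaluation protocol is not given here, so the averaging over mask pairs, the treatment of empty masks, and the one-dimensional histogram form of the earth mover distance are all assumptions:

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return np.logical_and(pred, target).sum() / union

def mean_iou(pairs):
    """Average IoU over (prediction, ground truth) mask pairs."""
    return float(np.mean([iou(p, t) for p, t in pairs]))

def emd_1d(hist_a, hist_b, bin_width=1.0):
    """Earth mover distance between two 1-D histograms on the same bins.

    Both histograms must have nonzero total mass; each is normalized to 1,
    and the distance is the summed gap between their cumulative sums.
    """
    a = np.asarray(hist_a, dtype=float)
    b = np.asarray(hist_b, dtype=float)
    a = a / a.sum()
    b = b / b.sum()
    return float(np.abs(np.cumsum(a) - np.cumsum(b)).sum() * bin_width)
```

Lower earth mover distance means the reconstructed distribution sits closer to the original; higher mean IoU means better mask overlap.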

Why it matters

Enables reliable, real-time robotic perception in dynamic environments by efficiently bridging sparse event data with standard dense vision pipelines.

Abstract

Event cameras promise low latency and high dynamic range, yet their sparse output challenges integration into standard robotic pipelines. We introduce EECVS (Efficient Event Camera Volume System), a novel framework that models event streams as continuous-time Dirac impulse trains, enabling artifact-free compression through direct transform evaluation at event timestamps. Our key innovation combines density-driven adaptive selection among DCT, DTFT, and DWT transforms with transform-specific coefficient pruning strategies tailored to each domain’s sparsity characteristics. The framework eliminates temporal binning artifacts while automatically adapting compression strategies based on real-time event density analysis. On EHPT-XC and MVSEC datasets, our framework achieves superior reconstruction fidelity with DTFT delivering the lowest earth mover distance. In downstream segmentation tasks, EECVS demonstrates robust generalization. Notably, our approach demonstrates exceptional cross-dataset generalization: when evaluated with EventSAM segmentation, EECVS achieves mean IoU 0.87 on MVSEC versus 0.44 for voxel grids at 24 channels, while remaining competitive on EHPT-XC. Our ROS2 implementation provides real-time deployment with DCT processing achieving 1.5 ms latency and 2.7× higher throughput than alternative transforms, establishing the first adaptive event compression framework that maintains both computational efficiency and superior generalization across diverse robotic scenarios.

Index terms

Visual Learning, Representation Learning, Software Tools for Benchmarking and Reproducibility