Continuous-Time Optical Flow Estimation from Asynchronous Event-Frame Streams for Embedded Systems
Daolong Yang, Hansheng Liang, Liu Haoyuan, Chengcai Wang, Bin Xu, Kun Xu, Xilun DING
AI summary
Problem
Existing hybrid event-frame optical flow methods struggle with real-time embedded deployment due to difficulty handling asynchronous temporal offsets between sensors and high computational latency from dense correlation volumes and iterative refinement.
Approach
The authors propose a lightweight architecture that uses intra- and inter-sensor temporal tracking modules to encode asynchronous inputs, then fuses them into a motion-compensated correlation volume for dense flow estimation in a single forward pass.
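To make the term "correlation volume" concrete, the following is a minimal NumPy sketch of a generic local correlation volume between two feature maps. It is not the authors' motion-compensated volume or their architecture: the function name `local_correlation_volume`, the `radius` parameter, the `(C, H, W)` feature layout, and the `1/sqrt(C)` scaling are all assumptions made for illustration. Restricting the search to a small window around each pixel is one common way to keep such a volume lightweight compared with all-pairs correlation.

```python
import numpy as np

def local_correlation_volume(feat1, feat2, radius=1):
    """Illustrative local correlation volume (hypothetical helper, not from the paper).

    feat1, feat2: feature maps of shape (C, H, W).
    Returns an array of shape ((2*radius+1)**2, H, W) in which channel k holds
    the dot product between feat1 at pixel (y, x) and feat2 at (y+dy, x+dx),
    with (dy, dx) enumerated row by row over the search window.
    """
    c, h, w = feat1.shape
    # Zero-pad the spatial dimensions so every displacement stays in bounds.
    padded = np.pad(feat2, ((0, 0), (radius, radius), (radius, radius)))
    side = 2 * radius + 1
    volume = np.empty((side * side, h, w), dtype=np.float64)
    k = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # shifted[:, y, x] equals feat2[:, y + dy, x + dx] (zero outside the image).
            shifted = padded[:, radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # Channel-wise dot product, scaled by sqrt(C) (an assumed normalization).
            volume[k] = (feat1 * shifted).sum(axis=0) / np.sqrt(c)
            k += 1
    return volume
```

With `radius=1` the volume has 9 channels per pixel instead of `H*W`, so its memory cost grows linearly with image size. A single-pass estimator could read flow directly from such a volume, whereas iterative methods repeatedly index a much larger one.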
Key results
- Up to 22% improvement in flow accuracy over state-of-the-art hybrid event-frame methods
- 3× faster inference on an NVIDIA Jetson Xavier NX embedded GPU
- Robust generalization across diverse frame-event temporal offsets
- Single-pass dense flow estimation via lightweight correlation volume
Why it matters
Enables real-time, high-accuracy optical flow estimation for resource-constrained robotic platforms operating in dynamic environments.
Abstract
Bioinspired event cameras, with their high temporal resolution, low power consumption, and inherent motion responsiveness, have been widely adopted for fundamental vision tasks in robotics, notably optical flow estimation. Recent studies have shown that incorporating complementary frame data can significantly enhance the performance of event-based optical flow estimation. However, two major challenges hinder the real-time deployment of such methods on robotic platforms: (1) the asynchronous nature of events and frames makes it difficult to generalize across varying input temporal offsets; and (2) reliance on computationally expensive correlation volume construction and iterative refinement results in high inference latency on embedded systems. To address these issues, we propose a novel method that takes asynchronous event and frame streams as input and predicts high-quality dense flow in a single forward pass. Our approach temporally encodes both intra- and inter-sensor features and efficiently integrates them into a lightweight correlation volume to enhance flow prediction. Experimental results on real-world scenes demonstrate that our method improves flow accuracy by up to 22% over state-of-the-art hybrid event-frame methods, while being 3× faster on embedded GPUs. Furthermore, our approach maintains strong performance and generalizes well across diverse frame-event temporal offsets, introducing a novel paradigm for fusing asynchronous frame and event streams for continuous-time optical flow estimation.