Keypoint-Based Dynamic Object 6-DoF Pose Tracking Via Event Camera
Zhe Wang, Qijin Song, Zihao Li, Jingyu Xiao, Weibang Bai
AI summary
Problem
Conventional cameras suffer from motion blur and low-light limitations when tracking fast-moving objects, while existing event-based pose methods struggle with curved geometries and require fixed initial poses.
Approach
The method detects object keypoints from event time surfaces using a lightweight neural network, tracks them via polarity-adaptive event density matching and an Extended Kalman Filter, and computes 6-DoF pose through 2D-3D correspondence.
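As a rough illustration of the time surface that the keypoint network takes as input, the sketch below builds one from a list of events. It assumes the common exponential-decay formulation, in which each pixel stores the timestamp of its most recent event and is weighted by how recent that event is. The function name `time_surface`, the event tuple layout `(x, y, t, polarity)`, and the decay constant `tau` are hypothetical choices for illustration, not details taken from the paper.

```python
import math


def time_surface(events, width, height, t_ref, tau=0.03):
    """Build an exponentially decayed time surface (illustrative sketch).

    events: iterable of (x, y, t, polarity) tuples, t in seconds (assumed layout).
    t_ref:  reference time at which the surface is evaluated.
    tau:    decay constant in seconds (assumed value, not from the paper).
    Returns a height x width nested list with values in [0, 1].
    """
    # Timestamp of the most recent event at each pixel; None means no event yet.
    last = [[None] * width for _ in range(height)]
    for x, y, t, _polarity in events:
        if t > t_ref:
            continue  # ignore events that occur after the reference time
        if 0 <= x < width and 0 <= y < height:
            if last[y][x] is None or t > last[y][x]:
                last[y][x] = t

    # Recent events map to values near 1, old events decay toward 0.
    surface = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if last[y][x] is not None:
                surface[y][x] = math.exp(-(t_ref - last[y][x]) / tau)
    return surface


events = [(1, 0, 0.00, 1), (2, 1, 0.03, -1), (1, 0, 0.06, 1)]
ts = time_surface(events, width=4, height=3, t_ref=0.06, tau=0.03)
```

In this toy example, pixel (1, 0) fired exactly at the reference time and so takes the maximum value, while pixel (2, 1) fired one decay constant earlier and is attenuated accordingly. Pixels that never fired stay at zero.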
Key results
- Lightweight neural network for robust keypoint detection in sparse event data
- Polarity-aware event density tracking algorithm with Extended Kalman Filter for drift reduction
- Structure-aware loss function ensuring geometric consistency and precise localization
- Superior accuracy and robustness over state-of-the-art methods in both simulated and real high-speed motion tests
Why it matters
Enables reliable robotic manipulation and assembly of fast-moving, curved objects in challenging environments where traditional vision fails.
Abstract
Accurate 6-DoF pose estimation of objects is critical for robots to perform precise manipulation tasks. However, for dynamic object pose estimation, conventional camera-based approaches face several major challenges, such as motion blur, sensor noise, and low-light limitations. To address these issues, we employ event cameras, whose high dynamic range and low latency offer a promising solution. Furthermore, we propose a keypoint-based detection and tracking approach for dynamic object pose estimation. First, a keypoint detection network is constructed to extract keypoints from the time surface generated by the event stream. Subsequently, the polarity and spatial coordinates of the events are leveraged, and the event density in the vicinity of each keypoint is used to achieve continuous keypoint tracking. Finally, a hash mapping is established between the 2D keypoints and the 3D model keypoints, and the EPnP algorithm is employed to estimate the 6-DoF pose. Experimental results demonstrate that, on both simulated and real event data, the proposed method outperforms state-of-the-art event-based methods in terms of both accuracy and robustness.
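The tracking step described above relies on the density of events near each keypoint. The sketch below is a deliberately simplified stand-in for that idea, not the paper's matching algorithm: it moves a keypoint to the centroid of nearby events of a chosen polarity, and leaves it in place when too few events support the update. The function name `update_keypoint` and the parameters `radius`, `polarity`, and `min_events` are hypothetical.

```python
def update_keypoint(keypoint, events, radius=5.0, polarity=None, min_events=3):
    """Shift a 2D keypoint toward the local event centroid (illustrative sketch).

    keypoint:   (x, y) position in pixels.
    events:     iterable of (x, y, t, polarity) tuples (assumed layout).
    radius:     neighbourhood radius in pixels (assumed value).
    polarity:   if given, only events with this polarity are used.
    min_events: minimum support required before the keypoint is moved.
    Returns (new_keypoint, number_of_supporting_events).
    """
    kx, ky = keypoint
    sum_x = sum_y = 0.0
    count = 0
    for x, y, _t, p in events:
        if polarity is not None and p != polarity:
            continue  # skip events of the other polarity
        if (x - kx) ** 2 + (y - ky) ** 2 <= radius ** 2:
            sum_x += x
            sum_y += y
            count += 1

    if count < min_events:
        # Too little evidence nearby: keep the previous estimate.
        return keypoint, count
    return (sum_x / count, sum_y / count), count


events = [
    (11, 10, 0.000, 1),
    (12, 10, 0.001, 1),
    (13, 10, 0.002, 1),
    (10, 10, 0.002, -1),
    (40, 40, 0.002, 1),  # far from the keypoint, ignored
]
moved, support = update_keypoint((10.0, 10.0), events, radius=5.0, polarity=1)
held, weak_support = update_keypoint((10.0, 10.0), events, radius=5.0, polarity=-1)
```

A filter such as the Extended Kalman Filter mentioned in the summary would typically take these per-step measurements as observations and smooth them over time. That stage is omitted here.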