Color-Pair Guided Robust Zero-Shot 6D Pose Estimation and Tracking of Cluttered Objects on Edge Devices
Xingjian Yang, Ashis Banerjee
AI summary
Problem
Robust 6D pose estimation for novel textured objects under challenging lighting remains difficult, often forcing a trade-off between accurate initial localization and efficient real-time tracking. Existing zero-shot methods rely on computationally heavy template matching or foundation models that introduce latency and struggle with drift and occlusions on edge devices.
Approach
The authors introduce a unified pipeline that leverages a novel, illumination-robust color-pair feature representation. This feature enables fast initial pose registration via geometric hashing and ICP, followed by a lightweight optical flow and depth-based tracker that uses the same feature logic for temporal correspondence filtering.
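The summary does not define the color-pair feature, so the sketch below is only a stand-in showing how a pair of colors can be turned into an illumination-tolerant hash key. It assumes lighting changes act as a global intensity scale and uses normalized chromaticity to cancel that scale. The names `chromaticity`, `color_pair_key`, `model_table`, and the `bins` parameter are hypothetical and are not taken from the paper or its code.

```python
import numpy as np

def chromaticity(rgb):
    """Map RGB to (r, g) chromaticity, which cancels a global intensity scale."""
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1, keepdims=True)
    total = np.where(total == 0, 1.0, total)  # avoid division by zero on black pixels
    return (rgb / total)[..., :2]

def color_pair_key(rgb_a, rgb_b, bins=8):
    """Quantize the chromaticities of two colors into a hashable key (hypothetical)."""
    qa = np.minimum((chromaticity(rgb_a) * bins).astype(int), bins - 1)
    qb = np.minimum((chromaticity(rgb_b) * bins).astype(int), bins - 1)
    return (int(qa[0]), int(qa[1]), int(qb[0]), int(qb[1]))

# Toy "model" table: key -> list of model edge-point labels (assumed structure).
model_table = {}
key = color_pair_key([200, 40, 40], [40, 40, 200])
model_table.setdefault(key, []).append("edge_point_0")

# The same color pair observed at half the brightness maps to the same key.
dimmed_key = color_pair_key([100, 20, 20], [20, 20, 100])
matches = model_table.get(dimmed_key, [])
```

Under this assumption a lookup from a dimmed observation still retrieves the stored model entry. Colors near a bin boundary can flip bins under noise, so a practical descriptor would need a more tolerant quantization than this one.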
Key results
- Novel lighting-invariant color-pair feature descriptor for robust correspondence matching
- Efficient initial pose estimation using Hough voting and weighted ICP on classified edge point clouds
- Lightweight tracking module combining optical flow, depth cues, and a viewpoint-invariant rotation estimator
- Competitive accuracy and high-fidelity tracking on benchmarks, even through abrupt pose changes and challenging illumination
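As a rough illustration of the weighted ICP mentioned in the list above, the sketch below shows the closed-form weighted rigid alignment (weighted Kabsch) that a typical ICP iteration solves once correspondences are fixed. How the paper assigns weights to classified edge points is not stated here, so the weights are simply taken as given. The function name `weighted_rigid_align` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def weighted_rigid_align(src, dst, weights):
    """Return R, t minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                       # normalize weights to sum to 1
    src_mean = w @ src                    # weighted centroids
    dst_mean = w @ dst
    src_c = src - src_mean
    dst_c = dst - dst_mean
    H = (src_c * w[:, None]).T @ dst_c    # 3x3 weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t

# Synthetic check: recover a known rotation about z and a translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 0.05])
dst = src @ R_true.T + t_true
weights = rng.uniform(0.5, 1.0, size=50)  # placeholder per-point confidences
R_est, t_est = weighted_rigid_align(src, dst, weights)
```

A full ICP loop would alternate this step with re-estimating nearest-neighbor correspondences until the pose stops changing. Only the alignment step is shown here.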
Why it matters
Enables real-time, robust 6D pose tracking for novel objects on resource-constrained edge devices, advancing autonomous manipulation and AR/VR applications.
Abstract
Robust 6D pose estimation of novel textured objects under challenging illumination remains a significant challenge, often requiring a trade-off between accurate initial pose estimation and efficient real-time tracking. We present a unified framework explicitly designed for efficient execution on edge devices, which synergizes a robust initial estimation module with a fast motion-based tracker. The key to our approach is a shared, lighting-invariant color-pair feature representation that forms a consistent foundation for both stages. For initial estimation, this feature facilitates robust registration between the live RGB-D view and the object’s 3D mesh. For tracking, the same feature logic validates temporal correspondences, enabling a lightweight model to reliably regress the object’s motion. Extensive experiments on benchmark datasets demonstrate that our integrated approach is both effective and robust, providing competitive pose estimation accuracy while maintaining high-fidelity tracking even through abrupt pose changes. Code: https://github.com/smartslab/Color-Pair-Guided-Zero-Shot-6D-Pose