DKPMV: Dense Keypoints Fusion from Multi-View RGB Frames for 6D Pose Estimation of Textureless Objects
Jiahong Chen, JingHao Wang, Zi Wang, Ziwen Wang, Banglei Guan, Qifeng Yu
AI summary
Problem
6D pose estimation for textureless objects is hindered by unreliable depth data on reflective surfaces and the scale/occlusion limitations of single-view RGB methods. Existing multi-view approaches either depend on degraded depth inputs or fail to fully exploit cross-view geometric consistency at the keypoint level.
Approach
DKPMV predicts dense keypoints from multiple RGB views, enhances them with attentional aggregation and symmetry-aware training, and fuses them via a three-stage progressive optimization pipeline to recover accurate 6D poses.
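To make the idea of fusing keypoints across views concrete, the sketch below triangulates one keypoint from several calibrated views by finding the 3D point closest to all viewing rays in the least-squares sense. This is a generic textbook technique, not the DKPMV pipeline itself. The function names (`triangulate_rays`, `solve3`, `det3`) and the assumption that each view supplies a camera center and a ray direction through the predicted keypoint are illustrative only.

```python
from typing import List, Sequence


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def solve3(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve the 3x3 system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("rays are (nearly) parallel; point is not observable")
    x = []
    for k in range(3):
        # Replace column k of a with b.
        ak = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        x.append(det3(ak) / d)
    return x


def triangulate_rays(
    centers: Sequence[Sequence[float]],
    directions: Sequence[Sequence[float]],
) -> List[float]:
    """Return the 3D point minimizing the summed squared distance to all rays.

    Each ray starts at a camera center c and points along d (hypothetical
    inputs: d would come from back-projecting a predicted 2D keypoint).
    The normal equations are sum(I - u u^T) x = sum((I - u u^T) c),
    where u is the unit vector along d.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        norm = sum(v * v for v in d) ** 0.5
        u = [v / norm for v in d]
        for i in range(3):
            for j in range(3):
                m = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += m
                b[i] += m * c[j]
    return solve3(a, b)
```

With noise-free rays from three non-collinear camera centers, the function recovers the original 3D point. With noisy keypoint predictions it returns a compromise point, which is the kind of cross-view consistency a fusion stage can build on.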
Key results
- Dense keypoint-level fusion using only multi-view RGB inputs
- Symmetry-aware training resolves pose ambiguities on symmetric objects
- Attentional aggregation improves keypoint localization and fusion stability
- Outperforms state-of-the-art multi-view RGB and RGB-D methods on the ROBI dataset
Why it matters
Offers a robust, RGB-only option for industrial robotic perception where depth sensors fail or are impractical.
Abstract
6D pose estimation of textureless objects is valuable for industrial robotic applications, yet remains challenging due to the frequent loss of depth information. Current multi-view methods either rely on depth data or insufficiently exploit multi-view geometric cues, limiting their performance. In this paper, we propose DKPMV, a pipeline that achieves dense keypoint-level fusion using only multi-view RGB images as input. We design a three-stage progressive pose optimization strategy that leverages dense multi-view keypoint geometry information. To enable effective dense keypoint fusion, we enhance the keypoint network with attentional aggregation and symmetry-aware training, improving prediction accuracy and resolving ambiguities on symmetric objects. Extensive experiments on the ROBI dataset demonstrate that DKPMV outperforms state-of-the-art multi-view RGB and RGB-D approaches. The code will be available at https://github.com/chenjiahongbq/DKPMV.
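As an illustration of how a training loss can avoid penalizing symmetric ambiguities, the sketch below takes the minimum keypoint error over a set of symmetry transforms of the object. This is one common formulation, assumed here for illustration. The abstract does not specify the exact form of symmetry-aware training used in DKPMV, and the names `symmetry_aware_loss`, `IDENTITY`, and `ROT_Z_180` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
Matrix = Sequence[Sequence[float]]

# Example symmetry set (assumption): an object that looks identical
# after a 180-degree rotation about its z-axis.
IDENTITY: Matrix = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ROT_Z_180: Matrix = ((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0))


def apply_rotation(r: Matrix, p: Point) -> Point:
    """Rotate point p by the 3x3 matrix r."""
    return tuple(sum(r[i][j] * p[j] for j in range(3)) for i in range(3))


def mean_keypoint_error(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Mean Euclidean distance between corresponding keypoints."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def symmetry_aware_loss(
    pred: Sequence[Point],
    gt: Sequence[Point],
    symmetries: Sequence[Matrix],
) -> float:
    """Smallest mean keypoint error over all symmetry-equivalent targets."""
    return min(
        mean_keypoint_error(pred, [apply_rotation(s, k) for k in gt])
        for s in symmetries
    )
```

For example, if the prediction equals the ground-truth keypoints rotated by 180 degrees about z, the loss with `[IDENTITY, ROT_Z_180]` is zero, whereas a plain loss using only `[IDENTITY]` would penalize a prediction that is visually indistinguishable from the correct one.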