PG-Match: A Pose-Guided Generalizable Framework for Semi-Dense Feature Matching
Jiayi Pei, Peili Song, Chenyang Zhao, Lei Sun, Jingtai Liu
AI summary
Problem
Existing detector-free feature matching methods rely on ground-truth depth data for supervision, which is scarce and limits generalization across diverse environments. Furthermore, traditional match supervision lacks global geometric consistency, reducing inlier ratios and degrading downstream performance.
Approach
PG-Match leverages ground-truth camera poses as supervision instead of depth, enabling end-to-end training via a Differentiable Outlier Rejection Module (DORM). It combines this with a confidence-guided coarse-to-fine matching strategy to efficiently refine semi-dense correspondences while maintaining global consistency.
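To illustrate how a ground-truth pose can stand in for depth as a supervision signal: a relative pose (R, t) defines an essential matrix E = [t]_x R, and any predicted match can be scored by its epipolar error under E without knowing the depth of the point. The sketch below is an illustrative assumption, not the loss or the DORM used by PG-Match. The function names are hypothetical, the pose convention X2 = R X1 + t is assumed, and points are assumed to be in normalized camera coordinates.

```python
def skew(t):
    """Cross-product matrix [t]_x of a 3-vector t."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def essential_from_pose(R, t):
    """E = [t]_x R, assuming the pose maps camera 1 to camera 2: X2 = R X1 + t."""
    return matmul(skew(t), R)


def sampson_error(E, x1, x2, eps=1e-12):
    """Sampson (first-order) epipolar error of a match (x1, x2).

    x1 and x2 are 2D points in normalized camera coordinates.
    The error is zero when the match satisfies x2^T E x1 = 0.
    """
    p1 = [x1[0], x1[1], 1.0]
    p2 = [x2[0], x2[1], 1.0]
    # E p1: epipolar line of x1 in image 2.
    Ep1 = [sum(E[i][k] * p1[k] for k in range(3)) for i in range(3)]
    # E^T p2: epipolar line of x2 in image 1.
    Etp2 = [sum(E[k][i] * p2[k] for k in range(3)) for i in range(3)]
    residual = sum(p2[i] * Ep1[i] for i in range(3))
    denom = Ep1[0] ** 2 + Ep1[1] ** 2 + Etp2[0] ** 2 + Etp2[1] ** 2
    return residual ** 2 / (denom + eps)


# Pure sideways translation, no rotation (hypothetical example pose).
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]
E = essential_from_pose(R, t)

# A 3D point (0, 0, 2) projects to (0, 0) in image 1 and (0.5, 0) in image 2.
good = sampson_error(E, (0.0, 0.0), (0.5, 0.0))
# Shifting the second point off the epipolar line gives a positive error.
bad = sampson_error(E, (0.0, 0.0), (0.5, 0.1))
```

A training loss built this way penalizes matches that violate the epipolar constraint. It cannot tell apart two candidates on the same epipolar line, which is one reason a pose-supervised method needs additional machinery beyond this sketch.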
Key results
- Outperforms state-of-the-art methods in pose accuracy on MegaDepth-1500
- Demonstrates strong cross-dataset generalization on PhotoTourism
- Improves accuracy and completeness in downstream SfM pipelines
- Increases inlier ratios through differentiable outlier rejection
Why it matters
Enables reliable, depth-independent feature matching for real-world 3D reconstruction and visual localization where ground-truth depth is unavailable.
Abstract
Feature matching is a fundamental technique in visual perception, essential for tasks such as 3D reconstruction, SLAM, and visual localization. Existing detector-free methods often struggle to generalize due to their reliance on depth data, which is not available in many datasets. We propose PG-Match, a detector-free feature matching framework that leverages pose supervision instead of depth-based supervision, thereby improving generalization across diverse environments. We further introduce a Differentiable Outlier Rejection Module (DORM) to enhance global consistency and increase the inlier ratio. For efficiency, a coarse-to-fine matching strategy is employed, where specially designed confidence scores are utilized to guide the sampling process. This ensures efficient convergence and avoids local optima. Experiments on the widely used MegaDepth-1500 dataset show that PG-Match consistently outperforms state-of-the-art approaches, highlighting the effectiveness of its pose-guided design. Additionally, experiments on the depth-free PhotoTourism dataset further evaluate the generalization of PG-Match, and its performance is also assessed in a downstream Structure from Motion (SfM) task.
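The abstract does not specify how the confidence scores guide sampling. As a minimal sketch of the general idea, assuming each coarse match carries a scalar confidence, one could pass only the most confident matches on to the fine stage. The function name, the `k` budget, and the `min_conf` threshold are hypothetical and not taken from the paper.

```python
def select_coarse_matches(matches, confidences, k, min_conf=0.2):
    """Keep at most k coarse matches, highest confidence first.

    matches: list of (index_in_image_1, index_in_image_2) pairs.
    confidences: one score per match, assumed to lie in [0, 1].
    Matches scoring below min_conf are dropped before the top-k cut.
    """
    ranked = sorted(zip(confidences, matches),
                    key=lambda pair: pair[0], reverse=True)
    return [m for c, m in ranked if c >= min_conf][:k]


coarse = [(0, 1), (2, 3), (4, 5), (6, 7)]
scores = [0.9, 0.1, 0.5, 0.7]
selected = select_coarse_matches(coarse, scores, k=2)
```

Restricting refinement to a fixed budget of confident matches keeps the cost of the fine stage bounded regardless of how many coarse candidates are produced.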