Non-Rigid Structure-From-Motion Via Differential Geometry with Recoverable Conformal Scale
Yongbo Chen, Yanhao Zhang, Shaifali Parashar, Liang Zhao, Shoudong Huang
AI summary
Problem
Existing non-rigid structure-from-motion methods rely on restrictive locally planar and linear assumptions and cannot recover conformal scale, leading to inaccurate depth estimation and poor performance on complex deformations.
Approach
The authors introduce Con-NRSfM, which uses differential geometry to model conformal deformations, decouples depth and scale estimation, and solves the problem via a parallel separable iterative optimization algorithm combined with a self-supervised encoder-decoder network.
Key results
- Proves rotational invariance under conformal deformation to decouple scale and depth
- Relaxes locally planar and linear surface assumptions using second-order derivatives
- Introduces a parallel separable iterative optimization algorithm for robust recovery
- Outperforms state-of-the-art methods in accuracy and robustness on synthetic and real datasets
Why it matters
Enables precise 3D mapping of deformable environments, advancing monocular deformable SLAM and robotic navigation in medical and dynamic settings.
Abstract
Non-rigid structure-from-motion (NRSfM), a promising technique for addressing the mapping challenges in monocular visual deformable simultaneous localization and mapping (SLAM), has attracted growing attention. We introduce a novel method, called Con-NRSfM, for NRSfM under conformal deformations, which include isometric deformations as a special case. Our approach performs point-wise reconstruction using selected 2D image warps optimized through a graph-based framework. Unlike existing methods that rely on strict assumptions, such as locally planar surfaces or locally linear deformations, and fail to recover the conformal scale, our method eliminates these constraints and accurately computes the local conformal scale. Additionally, our framework decouples constraints on depth and conformal scale, which are inseparable in other approaches, enabling more precise depth estimation. To address the sensitivity of the formulated problem, we employ a parallel separable iterative optimization strategy. Furthermore, a self-supervised learning framework, utilizing an encoder-decoder network, is incorporated to generate dense 3D point clouds with texture. Simulation and experimental results using both synthetic and real datasets demonstrate that our method surpasses existing approaches in terms of reconstruction accuracy and robustness. The code for the proposed method will be made publicly available on the project website: https://sites.google.com/view/con-nrsfm.
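To make the terms "conformal deformation" and "local conformal scale" concrete, the following is a minimal planar sketch and not the Con-NRSfM algorithm. It assumes a toy 2D-to-2D map rather than a surface in 3D, and the helper names (`jacobian`, `conformal_scale`, `square_map`, `rotation`, `shear`) are hypothetical, introduced only for illustration. A map is conformal at a point when its Jacobian J satisfies J^T J = s^2 I, where s is the local scale. An isometry is the special case s = 1.

```python
import numpy as np


def jacobian(f, p, eps=1e-6):
    """Central-difference Jacobian of a map f: R^2 -> R^2 at point p."""
    p = np.asarray(p, dtype=float)
    J = np.zeros((2, 2))
    for k in range(2):
        step = np.zeros(2)
        step[k] = eps
        J[:, k] = (f(p + step) - f(p - step)) / (2 * eps)
    return J


def conformal_scale(f, p):
    """Return (scale, residual) at p.

    The residual is the Frobenius norm of J^T J - scale^2 * I and is
    close to zero when f is conformal at p.
    """
    J = jacobian(f, p)
    G = J.T @ J                      # induced metric (first fundamental form)
    scale_sq = np.trace(G) / 2.0     # best isotropic fit to G
    residual = np.linalg.norm(G - scale_sq * np.eye(2))
    return float(np.sqrt(scale_sq)), float(residual)


def square_map(p):
    """Toy conformal map z -> z^2 (conformal away from the origin)."""
    x, y = p
    return np.array([x * x - y * y, 2.0 * x * y])


def rotation(p, theta=0.3):
    """Toy isometry: a rigid rotation, so the conformal scale is 1."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * p[0] - s * p[1], s * p[0] + c * p[1]])


def shear(p):
    """Toy non-conformal map: a shear distorts angles."""
    return np.array([p[0] + 0.5 * p[1], p[1]])


if __name__ == "__main__":
    point = np.array([1.0, 2.0])
    for name, f in [("square", square_map), ("rotation", rotation), ("shear", shear)]:
        scale, residual = conformal_scale(f, point)
        print(f"{name}: scale={scale:.4f}, residual={residual:.2e}")
```

In this toy setting the scale of `square_map` varies from point to point (it equals 2|p|), which is the sense in which a conformal scale is local, while the rotation has unit scale everywhere and the shear fails the conformality check.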