One-Shot View Planning and Online Optimization-Based Replanning for Unknown Object Reconstruction
José Johil Patiño Miñán, Zachary Kingston, Victor Romero-Cano, Yu-Kun Lai, Juan D. Hernández
AI summary
Problem
Traditional next-best view planning lacks global trajectory optimality, while existing one-shot view planning methods rely on inaccurate geometric priors that often produce infeasible views and inefficient paths when inspecting unknown objects.
Approach
The framework builds a geometric prior from an initial RGB-D image, plans the full set of views in one shot, and then replans the robot's path online using video-based reconstruction so that the path satisfies visibility constraints, stays smooth, and avoids collisions.
Key results
- Novel OSVP framework combining RGB-D priors with online video-based reconstruction
- Optimization-based path replanning that dynamically satisfies visibility and smoothness constraints
- Simulation benchmarks showing superior reconstruction quality and efficiency over state-of-the-art OSVP methods
- Real-world validation on a Franka Emika manipulator confirming practical feasibility
Why it matters
Enables robots to reconstruct unknown objects efficiently and accurately in real time, advancing autonomous inspection for quality control, cultural heritage, and manufacturing.
Abstract
Robotic inspection tasks often require constructing high-quality 3D models of objects from a minimal number of views. Traditional next-best view planning (NBVP) approaches incrementally select view poses but fail to account for the global optimality of the inspection trajectory, which leads to inefficient inspection paths. Recent one-shot view planning (OSVP) methods address this challenge by predicting informative view poses from an initial observation. While subsequent improvements on the pioneering OSVP approach attempt to improve prediction accuracy, they can still fail when faced with out-of-distribution (OoD) examples. With recent advances in generative modeling, OSVP methods can infer a plausible object shape from one observation and then derive the corresponding solution set of view poses. However, because the predicted shape may deviate from the true geometry, these methods can still generate infeasible views. To overcome these limitations, we propose a novel OSVP framework that leverages RGB-D data to generate geometric priors and incorporates online video-based reconstruction. Our method formulates viewpoint selection and path optimization so that both the computed poses and the connecting trajectories satisfy visibility constraints, maintain smoothness, and can be locally replanned to compensate for discrepancies between the predicted and real object geometries. We validate our OSVP approach through simulation benchmarks against state-of-the-art OSVP techniques and demonstrate its effectiveness on a real Franka Emika manipulator.
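As a rough illustration of the one-shot idea in the abstract (choosing a whole set of views from a predicted shape instead of one view at a time), the sketch below runs a greedy set cover over a toy visibility table. It is not the authors' formulation: the function name `greedy_view_cover`, the view names, and the surface-point indices are all hypothetical, and greedy set cover is only one simple way to pick a covering set of views.

```python
def greedy_view_cover(visibility, n_points):
    """Pick views until every coverable surface point is seen.

    visibility: dict mapping a candidate view name to the set of
    surface-point indices it sees on the predicted shape
    (a hypothetical input, assumed to be precomputed).
    """
    uncovered = set(range(n_points))
    chosen = []
    while uncovered:
        # Take the view that sees the most still-uncovered points;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(visibility),
                   key=lambda v: len(visibility[v] & uncovered))
        gain = visibility[best] & uncovered
        if not gain:
            break  # remaining points are not visible from any candidate
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Toy visibility table: 6 surface points, 4 candidate views.
visibility = {
    "front": {0, 1, 2},
    "left": {2, 3},
    "back": {3, 4, 5},
    "top": {1, 4},
}
views, missed = greedy_view_cover(visibility, 6)
print(views, missed)
```

If the predicted shape is wrong, some entries in such a visibility table are wrong as well, so a selected view may turn out to be infeasible on the real object. That is the discrepancy the abstract's online replanning is meant to compensate for.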