Efficient Construction of Implicit Surface Models from a Single Image for Motion Generation
Wei-Teng Chu, Tianyi Zhang, Matthew Johnson-Roberson, Weiming Zhi
AI summary
Problem
Existing neural implicit surface methods require dense multi-view images and lengthy training times, making them impractical for real-time robotics applications with sparse observations.
Approach
FINS uses pre-trained 3D foundation models to generate point cloud supervision from a single image, paired with a multi-resolution hash grid encoder and a staged hybrid optimizer for rapid convergence.
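As a rough illustration of what a multi-resolution hash grid encoder computes, the sketch below follows the widely used Instant-NGP-style scheme: at each resolution level, the eight voxel corners around a query point are hashed into a small feature table, the corner features are trilinearly interpolated, and the per-level results are concatenated. It is not taken from the FINS code. The class name `HashGridEncoder`, the level count, table size, resolutions, and hashing primes are all assumptions chosen for illustration, and the randomly initialized tables stand in for features that would be learned during training.

```python
import math
import random

# Spatial-hash primes popularized by Instant-NGP (assumed here, not confirmed for FINS).
PRIMES = (1, 2654435761, 805459861)


def hash_index(ix, iy, iz, table_size):
    """XOR-based spatial hash of an integer grid corner into [0, table_size)."""
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size


class HashGridEncoder:
    """Hypothetical minimal encoder; all hyperparameters are illustrative."""

    def __init__(self, num_levels=4, features_per_level=2, table_size=2 ** 10,
                 base_resolution=4, growth=2.0, seed=0):
        rng = random.Random(seed)
        self.features_per_level = features_per_level
        self.table_size = table_size
        # Grid resolution grows geometrically from coarse to fine.
        self.resolutions = [int(base_resolution * growth ** level)
                            for level in range(num_levels)]
        # One feature table per level; these entries would be trainable parameters.
        self.tables = [
            [[rng.uniform(-1e-4, 1e-4) for _ in range(features_per_level)]
             for _ in range(table_size)]
            for _ in range(num_levels)
        ]

    def encode(self, point):
        """Encode a 3D point with coordinates in [0, 1] into a feature vector."""
        encoded = []
        for res, table in zip(self.resolutions, self.tables):
            scaled = [c * res for c in point]
            # Clamp so that a coordinate of exactly 1.0 stays in the last voxel.
            base = [min(int(math.floor(s)), res - 1) for s in scaled]
            frac = [s - b for s, b in zip(scaled, base)]
            feat = [0.0] * self.features_per_level
            # Trilinear interpolation over the 8 corners of the enclosing voxel.
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        weight = ((frac[0] if dx else 1.0 - frac[0])
                                  * (frac[1] if dy else 1.0 - frac[1])
                                  * (frac[2] if dz else 1.0 - frac[2]))
                        row = table[hash_index(base[0] + dx, base[1] + dy,
                                               base[2] + dz, self.table_size)]
                        for k in range(self.features_per_level):
                            feat[k] += weight * row[k]
            encoded.extend(feat)
        return encoded


encoder = HashGridEncoder()
features = encoder.encode((0.3, 0.5, 0.7))
print(len(features))  # num_levels * features_per_level values
```

In a full pipeline, the concatenated vector would be passed to small heads that predict signed distance and color. Because the tables are small and each lookup touches only eight entries per level, this kind of encoder is usually cheap to evaluate and to optimize.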
Key results
- Achieves high-precision SDF training from a single image in ~10 seconds
- Leverages 3D foundation models for effective single-view supervision
- Outperforms state-of-the-art baselines in convergence speed and reconstruction accuracy
- Demonstrates successful robot surface following and scalability across benchmarks
Why it matters
Enables real-time, sparse-view 3D reconstruction for downstream robotics tasks like obstacle avoidance, path planning, and surface inspection.
Abstract
Implicit representations have been widely applied in robotics for obstacle avoidance and path planning. In this paper, we explore the problem of constructing an implicit distance representation from a single image. Past methods for implicit surface reconstruction, such as NeuS and its variants, generally require a large set of multi-view images as input and long training times. In this work, we propose Fast Image-to-Neural Surface (FINS), a lightweight framework that can reconstruct high-fidelity surfaces and SDF fields from a single image or a small set of images. FINS integrates a multi-resolution hash grid encoder with lightweight geometry and color heads, which makes training with an approximate second-order optimizer highly efficient and capable of converging within a few seconds. Additionally, we construct a neural surface from only a single RGB image by leveraging pre-trained foundation models to estimate the geometry inherent in the image. Our experiments demonstrate that, under the same conditions, our method outperforms state-of-the-art baselines in both convergence speed and accuracy on surface reconstruction and SDF field estimation. Moreover, we demonstrate the applicability of FINS to robot surface following tasks and show its scalability to a variety of benchmark datasets. Code is publicly available at https://github.com/waynechu1109/FINS
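To make concrete how a signed distance field (SDF) can support surface following, here is a minimal sketch under stated assumptions. An analytic sphere SDF stands in for a learned field, gradients are estimated by central differences, and a point is kept on the surface by alternating a tangential step with a Newton-style projection back onto the zero level set. The function names `sphere_sdf`, `project_to_surface`, and `follow_surface` are hypothetical, and this is not the controller used in the paper.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, center))) - radius


def gradient(sdf, p, eps=1e-5):
    """Central-difference estimate of the SDF gradient (the surface normal direction)."""
    grad = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * eps))
    return grad


def project_to_surface(sdf, p, iters=10):
    """Move p onto the zero level set with Newton-style steps along the gradient."""
    p = list(p)
    for _ in range(iters):
        g = gradient(sdf, p)
        norm_sq = sum(c * c for c in g)
        if norm_sq < 1e-12:
            break  # gradient vanished; no reliable direction to step in
        step = sdf(p) / norm_sq
        p = [pi - step * gi for pi, gi in zip(p, g)]
    return p


def follow_surface(sdf, start, direction, step_size=0.1, num_steps=10):
    """Trace a path on the surface by stepping tangentially, then re-projecting."""
    p = project_to_surface(sdf, start)
    path = [p]
    for _ in range(num_steps):
        g = gradient(sdf, p)
        norm = math.sqrt(sum(c * c for c in g))
        normal = [c / norm for c in g]
        # Remove the normal component of the desired direction to get a tangent.
        dot = sum(d * n for d, n in zip(direction, normal))
        tangent = [d - dot * n for d, n in zip(direction, normal)]
        t_norm = math.sqrt(sum(c * c for c in tangent))
        if t_norm < 1e-9:
            break  # desired direction is parallel to the normal
        p = [pi + step_size * ti / t_norm for pi, ti in zip(p, tangent)]
        p = project_to_surface(sdf, p)
        path.append(p)
    return path


path = follow_surface(sphere_sdf, start=(2.0, 0.0, 0.0), direction=(0.0, 1.0, 0.0))
print(max(abs(sphere_sdf(p)) for p in path))  # residual distance to the surface
```

With a learned SDF, the same loop would query the network for distance values and could use automatic differentiation for the gradient instead of finite differences. The same field also gives a clearance value for obstacle avoidance, since its magnitude approximates the distance to the nearest surface.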