IROS 2024

Fine-Tuning the Diffusion Model and Distilling Informative Priors for Sparse-View 3D Reconstruction

Jiadong Tang, Yu Gao, Tianji Jiang, Yi Yang, Mengyin Fu

Abstract

3D reconstruction methods such as Neural Radiance Fields (NeRFs) can optimize high-quality 3D representations from images. However, NeRF requires a large number of multi-view images, which makes it challenging to apply in real-world scenarios. In this work, we propose a method that reconstructs real-world scenes from a few input images and a simple text prompt. Specifically, we fine-tune a pretrained diffusion model to constrain its powerful priors to the visual inputs and generate 3D-aware images, using the coarse renderings obtained from the input images as the image condition and the text prompt as the text condition. Our fine-tuning method saves a significant amount of training time and GPU memory while still generating credible results. Moreover, to give our method a self-evaluation capability, we design a semantic switch that filters out generated images that do not match real scenes, ensuring that only informative priors from the fine-tuned diffusion model are distilled into the 3D model. The semantic switch can be used as a plug-in and improves performance by 13%. We evaluate our approach on a real-world dataset and demonstrate competitive results compared with existing sparse-view 3D reconstruction methods. Please see our project page for more visualizations and code: https://bityia.github.io/FDfusion.

*This work was partly supported by the National Key R&D Program of China (2022YFC2603600) and the National Natural Science Foundation of China (Grant No. NSFC 62233002).
1 School of Automation, Beijing Institute of Technology, Beijing, China
2 National Key Lab of Autonomous Intelligent Unmanned Systems, Beijing Institute of Technology, Beijing, China
*Corresponding author: Y. Yang. Email: yang yi@bit.edu.cn

Index terms

Deep Learning for Visual Perception; Deep Learning Methods; AI-Based Methods