ICRA 2026

Robust SG-NeRF: Robust Scene Graph Aided Neural Surface Reconstruction

Yi Gu∗, Dongjun Ye∗, Zhaorui Wang∗, Jiaxu Wang, Jiahang Cao, Mingle Zhao, Renjing Xu

AI summary

A detached color network reliably distinguishes inlier from outlier camera poses, enabling robust 3D reconstruction even with severely noisy initial poses.

Neural surface reconstruction, camera pose optimization, scene graph, outlier detection, Monte Carlo re-localization, NeRF

Problem

Existing neural surface reconstruction methods struggle to correct large camera pose errors (outliers) from Structure from Motion pipelines, often degrading geometry due to shape-radiance ambiguities.

Approach

The method uses a detached color network to estimate pose confidence, separating inliers from outliers, then applies Monte Carlo re-localization for outliers and re-projection/IoU losses for inliers, guided by an updated scene graph.
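
As a rough illustration of how pose confidence could steer training toward inliers, the sketch below splits a fixed ray budget across images in proportion to a per-image confidence score. The function name `allocate_rays`, the proportional rule, and the largest-remainder rounding are assumptions made for this example; the summary does not state the sampling schedule the authors actually use.

```python
def allocate_rays(confidences, total_rays):
    """Split a ray budget across images in proportion to pose confidence.

    Hypothetical helper: proportional allocation is an assumption of this
    sketch, not the schedule described in the paper.
    """
    if total_rays < 0:
        raise ValueError("total_rays must be non-negative")
    total_conf = sum(confidences)
    if total_conf <= 0:
        raise ValueError("at least one confidence must be positive")

    # Ideal (fractional) share of the budget for each image.
    shares = [c / total_conf * total_rays for c in confidences]
    # Start from the floor of each share.
    counts = [int(s) for s in shares]
    # Hand the remaining rays to the images with the largest fractional parts,
    # so the counts always sum exactly to total_rays.
    leftover = total_rays - sum(counts)
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Two confident (inlier-like) images and one low-confidence (outlier-like) image.
ray_counts = allocate_rays([1.0, 1.0, 0.5], 10)
```

With this rule, images whose poses look like outliers still contribute a few rays, so their confidence can recover if their poses are later corrected.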

Key results

Plug-and-play confidence estimation identifies inlier and outlier poses
Monte Carlo re-localization corrects severely noisy and mirrored poses
Re-projection and IoU losses enhance geometric consistency
Consistently improves reconstruction quality and pose accuracy on benchmark datasets
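
To make the re-projection term above more concrete, here is a minimal sketch under simplifying assumptions: a pinhole camera with a single focal length, world-to-camera poses given as a rotation matrix and translation vector, and 3D surface points that have already been obtained from the current SDF (for example by ray marching, which is not shown). The names `project` and `reprojection_loss` are hypothetical, and the exact loss used in the paper may differ.

```python
import math


def project(point_world, rotation, translation, focal, principal):
    """Project a 3D world point into pixel coordinates with a pinhole model."""
    # World -> camera frame: x_cam = R @ x_world + t.
    x_cam = [
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point lies behind the camera")
    u = focal * x_cam[0] / x_cam[2] + principal[0]
    v = focal * x_cam[1] / x_cam[2] + principal[1]
    return (u, v)


def reprojection_loss(surface_points, matched_pixels,
                      rotation, translation, focal, principal):
    """Mean pixel distance between projected surface points and their matches.

    surface_points: 3D points assumed to come from the current SDF surface.
    matched_pixels: the corresponding keypoint locations in the second image.
    """
    if not surface_points:
        raise ValueError("need at least one correspondence")
    total = 0.0
    for point, (u_obs, v_obs) in zip(surface_points, matched_pixels):
        u, v = project(point, rotation, translation, focal, principal)
        total += math.hypot(u - u_obs, v - v_obs)
    return total / len(surface_points)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
pixels = [(50.0, 50.0), (100.0, 50.0)]
# A pose that is off by 0.1 units along x produces a non-zero loss.
loss = reprojection_loss(points, pixels, identity, [0.1, 0.0, 0.0], 100.0, (50.0, 50.0))
```

Because the loss depends on both the surface points and the pose of the second image, its gradient can in principle tighten geometry and pose together for each matching image pair.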

Why it matters

Enables reliable 3D reconstruction from unstructured, noisy image collections without requiring manual pose correction or high-quality initial SfM outputs.

Abstract

Neural surface reconstruction relies heavily on accurate camera poses as input. Even with advanced pose estimators such as COLMAP or ARKit, camera poses can still be noisy. Existing pose-NeRF joint optimization methods handle poses with small noise (inliers) effectively but struggle with large noise (outliers), such as mirrored poses. In this work, we focus on mitigating the impact of outlier poses. Our method integrates an inlier-outlier confidence estimation scheme, leveraging scene graph information gathered during the data preparation phase. Unlike previous works that directly use rendering metrics as the reference, we employ a detached color network that omits the viewing direction as input to minimize the impact of shape-radiance ambiguities. This enhanced confidence updating strategy effectively differentiates between inlier and outlier poses, allowing us to sample more rays from inlier poses to construct more reliable radiance fields. Additionally, we introduce a re-projection loss based on the current Signed Distance Function (SDF) and pose estimations, strengthening the constraints between matching image pairs. For outlier poses, we adopt a Monte Carlo re-localization method to find better solutions. We also devise a scene graph updating strategy to provide more accurate information throughout the training process. We validate our approach on the SG-NeRF and DTU datasets. Experimental results on these datasets demonstrate that our method consistently improves reconstruction quality and pose accuracy. Project page: https://rsg-nerf.github.io/RSG-NeRF/.
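
The Monte Carlo re-localization step mentioned in the abstract can be pictured as a sample-and-score search over candidate poses. The sketch below is a generic version under loudly stated assumptions: the pose is a plain tuple of floats, candidates are drawn as Gaussian perturbations around the best pose so far, and `cost_fn` stands in for whatever rendering or matching loss scores a candidate. None of these choices are taken from the paper, and the name `monte_carlo_relocalize` is hypothetical.

```python
import random


def monte_carlo_relocalize(initial_pose, cost_fn, num_samples=200,
                           scale=1.0, rounds=5, seed=0):
    """Random-search re-localization: keep the lowest-cost sampled pose.

    Assumptions of this sketch (not from the paper): Gaussian proposals
    around the current best pose, with the proposal scale halved each round.
    """
    rng = random.Random(seed)
    best_pose = tuple(initial_pose)
    best_cost = cost_fn(best_pose)
    for _ in range(rounds):
        center = best_pose  # sample around the best pose from earlier rounds
        for _ in range(num_samples):
            candidate = tuple(c + rng.gauss(0.0, scale) for c in center)
            cost = cost_fn(candidate)
            if cost < best_cost:
                best_pose, best_cost = candidate, cost
        scale *= 0.5  # narrow the search as the estimate improves
    return best_pose, best_cost


# Toy example: the "true" pose is known, and the cost is squared distance to it.
true_pose = (1.0, -2.0, 0.5)


def toy_cost(pose):
    return sum((p - t) ** 2 for p, t in zip(pose, true_pose))


pose, cost = monte_carlo_relocalize((4.0, 3.0, -1.0), toy_cost)
```

Since the initial pose is always kept as a fallback, the returned cost can never be worse than the starting one. A local Gaussian proposal like this one would likely not recover a mirrored pose on its own; covering such cases would need proposals that span the whole pose space, which this toy version does not attempt.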

Index terms

Deep Learning for Visual Perception, Visual Learning, Computer Vision for Automation