ICRA 2026

Robust Bayesian Scene Reconstruction with Retrieval-Augmented Priors for Precise Grasping and Planning

Herbert Wright, Weiming Zhi, Martin Matak, Matthew Johnson-Roberson, Tucker Hermans

AI summary

BRRP leverages retrieved mesh priors within a Bayesian framework to robustly reconstruct occluded 3D scenes and quantify uncertainty, enabling more reliable robotic grasping.

Bayesian reconstruction, retrieval-augmented priors, 3D scene understanding, robotic grasping, uncertainty quantification, Hilbert maps

Problem

Building accurate 3D representations from noisy, partial single-view RGBD data is essential for robotic manipulation, but it is hindered by occlusions and previously unseen objects. Deep learning approaches can be brittle to noisy real-world observations and to objects outside their training data, and they do not provide well-calibrated confidences, while traditional non-learning methods generally cannot infer unobserved geometry.

Approach

The method uses a foundation model to retrieve relevant shape priors from a mesh database, then combines them with observed depth data via Stein Variational Gradient Descent to infer a posterior distribution over object shapes.
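To make the inference step more concrete, here is a minimal sketch of a generic Stein Variational Gradient Descent update in NumPy. It is not the BRRP implementation: the function names (`rbf_kernel`, `svgd_step`), the fixed RBF bandwidth, the step size, and the toy Gaussian target are all assumptions chosen for illustration. In a setting like the one described above, `grad_log_p` would presumably be the gradient of the log posterior combining the retrieved prior with the depth likelihood, which is not specified here.

```python
import numpy as np

def rbf_kernel(particles, bandwidth):
    """RBF kernel matrix and its gradient with respect to the first argument."""
    # diffs[j, i] = x_j - x_i, shape (n, n, d)
    diffs = particles[:, None, :] - particles[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / bandwidth)
    # grad_k[j, i] = d k(x_j, x_i) / d x_j
    grad_k = (-2.0 / bandwidth) * diffs * k[:, :, None]
    return k, grad_k

def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """One SVGD update: a kernel-weighted score term plus a repulsive term."""
    n = particles.shape[0]
    k, grad_k = rbf_kernel(particles, bandwidth)
    scores = grad_log_p(particles)  # shape (n, d)
    # phi[i] = (1/n) * sum_j [ k(x_j, x_i) * score_j + grad_{x_j} k(x_j, x_i) ]
    phi = (k.T @ scores + grad_k.sum(axis=0)) / n
    return particles + step_size * phi

# Toy target (an assumption for this sketch): unit-variance Gaussian at `mu`.
mu = np.array([2.0, -1.0])

def toy_grad_log_p(x):
    return -(x - mu)

rng = np.random.default_rng(0)
particles = rng.normal(size=(50, 2))
for _ in range(1000):
    particles = svgd_step(particles, toy_grad_log_p)
```

The first term pulls particles toward high-probability regions, while the kernel-gradient term pushes them apart, so the particle set approximates a distribution rather than collapsing to a single point estimate.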

Key results

Accurate multi-object 3D reconstructions from single RGBD views
Robustness to noisy real-world data and out-of-distribution objects
Principled uncertainty quantification for occluded geometry
Improved real-world grasping success in cluttered scenes
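
As a rough illustration of how a sampled posterior can yield both a reconstruction and an uncertainty estimate, the sketch below summarizes a set of posterior samples at a fixed set of query points. The function name `occupancy_summary` and the assumed input layout (one row of occupancy probabilities per posterior sample) are hypothetical and are not taken from the paper.

```python
import numpy as np

def occupancy_summary(samples):
    """Summarize posterior samples of occupancy at a set of query points.

    samples: array of shape (num_samples, num_points), each entry an
             occupancy probability in [0, 1] under one posterior sample.
    Returns the per-point mean (a reconstruction estimate) and the
    per-point standard deviation (a simple uncertainty measure).
    """
    samples = np.asarray(samples, dtype=float)
    mean_occupancy = samples.mean(axis=0)
    uncertainty = samples.std(axis=0)
    return mean_occupancy, uncertainty

# Two hypothetical posterior samples over two query points: the samples
# agree on the first point and disagree on the second.
mean_occupancy, uncertainty = occupancy_summary([[0.9, 0.1],
                                                 [0.9, 0.9]])
```

Points where the samples disagree receive a larger standard deviation, which is the kind of signal a grasp planner could use to avoid relying on occluded regions.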

Why it matters

Enables robots to safely and accurately manipulate objects in unstructured environments by providing reliable geometric understanding and uncertainty estimates.

Abstract

Constructing 3D representations of object geometry is critical for many robotics tasks, particularly manipulation problems. These representations must be built from potentially noisy partial observations. In this work, we focus on the problem of reconstructing a multi-object scene from a single RGBD image using a fixed camera. Traditional scene representation methods generally cannot infer the geometry of unobserved regions of the objects in the image. Attempts have been made to leverage deep learning to train on a dataset of known objects and representations, and then generalize to new observations. However, these approaches can be brittle to noisy real-world observations and to objects not contained in the dataset, and they do not provide well-calibrated reconstruction confidences. We propose BRRP, a reconstruction method that leverages preexisting mesh datasets to build an informative prior during robust probabilistic reconstruction. We introduce the concept of a retrieval-augmented prior, where we retrieve relevant components of our prior distribution from a database of objects during inference. The resulting prior enables estimation of the geometry of occluded portions of the in-scene objects. Our method produces a distribution over object shape that can be used for reconstruction and measuring uncertainty. We evaluate our method in both simulated scenes and in the real world. We demonstrate that our method is more robust than deep learning-only approaches while being more accurate than a method without an informative prior. Through real-world experiments, we particularly highlight the capability of BRRP to enable successful dexterous manipulation in clutter.

Index terms

Perception for Grasping and Manipulation; Probabilistic Inference