ICRA 2026

Informative Object-Centric Next Best View for Object-Aware 3D Gaussian Splatting in Cluttered Scenes

Seung Hoon Jeong, Eunho Lee, Jeongyun Kim, Ayoung Kim

AI summary

Prioritizing underexplored, object-relevant regions via confidence-weighted information gain reduces depth error by up to 77.14% on synthetic data and 34.10% on real-world data, enabling efficient, targeted 3D reconstruction for robotic manipulation.

Next Best View; 3D Gaussian Splatting; Object-centric Reconstruction; Robotic Manipulation; Uncertainty Quantification; Cluttered Scenes

Problem

Existing view selection methods for 3D reconstruction rely solely on geometric cues, neglect manipulation-relevant semantics, and favor exploitation over exploration. This makes them inefficient at reconstructing specific occluded objects in cluttered scenes.

Approach

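The method distills instance segmentation masks into 3D Gaussian Splatting as one-hot object vectors. These vectors are used to compute a confidence-weighted information gain, which guides a Next Best View policy toward regions associated with erroneous and uncertain Gaussians. Restricting the gain to a single target object yields an object-centric variant of the policy.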
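The view-scoring idea can be illustrated with a minimal sketch. This is not the authors' formulation: the `Gaussian` fields, the `1 - confidence` weighting, and the per-Gaussian `uncertainty` value are all assumptions made for illustration, and the visibility sets are taken as given rather than rendered. It only shows how a confidence weight and an optional target-object filter could shape the choice of the next view.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Gaussian:
    # Hypothetical fields, not taken from the paper.
    object_probs: List[float]  # soft object vector; near one-hot when confident
    uncertainty: float         # stand-in for a per-Gaussian information gain term


def object_label(g: Gaussian) -> int:
    """Index of the most likely object for this Gaussian."""
    return max(range(len(g.object_probs)), key=lambda i: g.object_probs[i])


def confidence(g: Gaussian) -> float:
    """Assumed confidence: the largest entry of the object vector."""
    return max(g.object_probs)


def view_score(visible: Sequence[Gaussian], target: Optional[int] = None) -> float:
    """Sum of uncertainty weighted by (1 - confidence) over visible Gaussians.

    If `target` is given, only Gaussians labeled as that object contribute,
    which mimics an object-centric policy.
    """
    score = 0.0
    for g in visible:
        if target is not None and object_label(g) != target:
            continue
        score += (1.0 - confidence(g)) * g.uncertainty
    return score


def select_next_view(
    candidates: Dict[str, Sequence[Gaussian]], target: Optional[int] = None
) -> str:
    """Return the name of the candidate view with the highest score."""
    return max(candidates, key=lambda name: view_score(candidates[name], target))


# Toy scene: two objects (labels 0 and 1) and two candidate views.
g_a = Gaussian([0.9, 0.1], 0.2)    # confident, low uncertainty, object 0
g_b = Gaussian([0.4, 0.6], 1.0)    # unsure, high uncertainty, object 1
g_c = Gaussian([0.55, 0.45], 0.8)  # unsure, high uncertainty, object 0
views = {"left": [g_a, g_b], "right": [g_a, g_c]}

scene_choice = select_next_view(views)             # scores every visible Gaussian
object_choice = select_next_view(views, target=0)  # scores only object 0
```

In this toy scene the scene-wide policy and the object-centric policy pick different views, because the most uncertain Gaussian overall belongs to an object other than the target.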

Key results

Reduces depth error by up to 77.14% on the synthetic dataset and 34.10% on the real-world GraspNet dataset compared to baselines
Achieves an additional 25.60% depth error reduction for the target object when view selection focuses on that object rather than the entire scene
Accelerates scene refinement with fewer optimization iterations while preserving reconstruction quality
Validated through successful real-world robotic grasp pose generation in cluttered environments

Why it matters

Enables robots to efficiently and accurately reconstruct specific target objects in cluttered environments, directly improving downstream manipulation and grasping performance.

Abstract

In cluttered scenes with inevitable occlusions and incomplete observations, selecting informative viewpoints is essential for building a reliable representation. In this context, 3D Gaussian Splatting (3DGS) offers a distinct advantage, as it can explicitly guide the selection of subsequent viewpoints and then refine the representation with new observations. However, existing approaches rely solely on geometric cues, neglect manipulation-relevant semantics, and tend to prioritize exploitation over exploration. To tackle these limitations, we introduce an instance-aware Next Best View (NBV) policy that prioritizes underexplored regions by leveraging object features. Specifically, our object-aware 3DGS distills instance-level information into one-hot object vectors, which are used to compute confidence-weighted information gain that guides the identification of regions associated with erroneous and uncertain Gaussians. Furthermore, our method can be easily adapted to an object-centric NBV, which focuses view selection on a target object, thereby improving reconstruction robustness to object placement. Experiments demonstrate that our NBV policy reduces depth error by up to 77.14% on the synthetic dataset and 34.10% on the real-world GraspNet dataset compared to baselines. Moreover, compared to targeting the entire scene, performing NBV on a specific object yields an additional reduction of 25.60% in depth error for that object. We further validate the effectiveness of our approach through real-world robotic manipulation tasks.

Index terms

Perception for Grasping and Manipulation; Manipulation Planning; Deep Learning in Grasping and Manipulation