ICRA 2026

PIPS: Planar Instance 3D Reconstruction Leveraging Planar Structural Priors

Jiahui Wang, Ye Chen, Yinan Deng, Yi Yang, Yufeng Yue

AI summary

Key figure (auto-extracted from paper)
PIPS achieves semantically and geometrically aligned 3D planar reconstruction by leveraging structural priors for annotation-free segmentation and multi-view instance association.
Planar reconstruction, 3D instance segmentation, multi-view stereo, structural priors, semantic-geometry alignment, robotics

Problem

Existing multi-view planar reconstruction methods lack explicit planar instance definitions, which leads to misaligned semantics and distorted geometry. They also rely heavily on annotations or per-scene feature optimization.

Approach

The method uses a structure-guided segmentor to extract single-view planar masks from monocular cues, then associates them across views via a normal-guided mask graph to form consistent 3D instance point clouds, which are regularized and meshed into final planar surfaces.
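The association step can be pictured with a small sketch. This is not the authors' implementation: it assumes each single-view mask has already been reduced to a plane normal and offset in a shared world frame, links masks from different views whose planes roughly agree, and takes connected components of the resulting graph as instances. The function name `associate_masks` and both thresholds are hypothetical.

```python
import numpy as np

def associate_masks(normals, offsets, views, max_angle_deg=10.0, max_offset=0.05):
    """Group single-view planar masks into instances (illustrative only).

    normals: (N, 3) unit plane normals in a shared world frame
    offsets: (N,) plane offsets d in the plane equation n.x + d = 0
    views:   (N,) index of the view each mask came from
    Returns a list of N instance labels (0, 1, 2, ...).
    """
    normals = np.asarray(normals, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    n = len(normals)
    parent = list(range(n))  # union-find forest over mask indices

    def find(i):
        # Find the root of i with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    cos_thresh = np.cos(np.deg2rad(max_angle_deg))
    for i in range(n):
        for j in range(i + 1, n):
            if views[i] == views[j]:
                continue  # only link masks observed in different views
            similar_normal = normals[i] @ normals[j] >= cos_thresh
            similar_offset = abs(offsets[i] - offsets[j]) <= max_offset
            if similar_normal and similar_offset:
                parent[find(i)] = find(j)  # add a graph edge by merging sets

    # Relabel roots as consecutive instance ids in order of first appearance.
    label_of = {}
    return [label_of.setdefault(find(i), len(label_of)) for i in range(n)]
```

For example, two floor masks seen from views 0 and 1 would share a label, while a wall mask and a parallel plane one metre away would each get their own. A real mask graph would likely also weigh spatial overlap between masks, which this sketch leaves out.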

Key results

  • Annotation-free single-view segmentation using monocular structural priors
  • Multi-view instance association via normal-guided mask graph clustering
  • Instance-level planar meshing with distance normalization for geometric regularization
  • State-of-the-art accuracy on ScanNetV2 and ScanNet++ without per-scene optimization
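As a rough illustration of geometric regularization on a single planar instance, the sketch below fits a least-squares plane to an instance point cloud and moves every point onto it by its signed point-to-plane distance. It is an assumption about what such a step could look like, not the paper's distance normalization; `fit_plane` and `snap_to_plane` are hypothetical names.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through a point cloud via SVD.

    Returns (normal, offset) with unit normal n and offset d such that
    n.x + d = 0 for points x on the plane.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    offset = -normal @ centroid
    return normal, offset

def snap_to_plane(points, normal, offset):
    """Move each point along the normal by its signed distance to the plane."""
    points = np.asarray(points, dtype=float)
    dist = points @ normal + offset  # signed point-to-plane distances
    return points - dist[:, None] * normal

# Noisy samples of a horizontal surface near z = 2.
cloud = np.array([
    [0.0, 0.0, 2.01],
    [1.0, 0.0, 1.99],
    [0.0, 1.0, 1.99],
    [1.0, 1.0, 2.01],
])
normal, offset = fit_plane(cloud)
flat = snap_to_plane(cloud, normal, offset)
```

After snapping, every point in `flat` satisfies the fitted plane equation up to floating-point error, so a mesh built from the instance is exactly planar.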

Why it matters

PIPS provides a lightweight, annotation-efficient pipeline for accurate 3D scene abstraction, which is critical for robot navigation and virtual reality applications.

Abstract

Planar structures, ubiquitous in man-made indoor environments, enable compact and accurate scene abstraction for various downstream tasks. Recent methods distill planar features into learning-based MVS geometries to obtain coherent 3D plane estimation from multi-view inputs. However, the lack of explicit planar instance definitions hinders semantic–geometry alignment, leading to distorted geometry and mismatched semantics. To address this, we propose PIPS, a planar-instance 3D reconstruction method that leverages planar structural priors for both single-view planar segmentation (SGPS module) and multi-view instance association (MVPI module). The planar instance point clouds are regularized by planar distances and then converted into complete planar meshes via an instance-level planar meshing strategy. Extensive experiments on hundreds of indoor scenes demonstrate the superior performance of our method, which is less dependent on annotations and requires no feature optimization. The effectiveness of each component is further verified through comprehensive ablation studies. The project page of PIPS is available at https://pips325.github.io.

Index terms

Mapping
