ICRA 2026

GFreeDet2: Exploiting Gaussian Splatting and Foundation Models for RGB-Based Model-Free 2D and 6D Detection of Unseen Objects

Gu Wang, Xingyu Liu, Jingyi Tang, Chengxi Li, Yingyue Li, Ziqin Huang, Xiangyang Ji

AI summary

GFreeDet2 achieves state-of-the-art model-free 2D and 6D detection of unseen objects using Gaussian Splatting and foundation models, unifying pinhole and fisheye camera inputs without CAD models.
Gaussian Splatting, Model-free Detection, Unseen Objects, 6D Pose Estimation, Foundation Models, Fisheye Cameras

Problem

Existing unseen object detection methods rely on costly CAD models, require depth sensors, or are limited to specific camera types, hindering scalability in open-world robotics and mixed reality.

Approach

The method reconstructs 3D Gaussian object models from multi-view RGB references using projection-aware perspective cropping to handle both pinhole and fisheye cameras, then integrates these models into foundation model-driven detection pipelines.
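The idea of cutting a pinhole-style crop out of a fisheye image can be sketched as follows. This is only an illustration, not the paper's PAPC implementation: it assumes an equidistant fisheye model (r = f·θ), nearest-neighbour sampling, and a virtual pinhole camera aimed at the object's image centre. All function names and parameters here are hypothetical.

```python
import numpy as np

def unproject_equidistant(u, v, f, cx, cy):
    """Pixel -> unit ray under an ASSUMED equidistant fisheye model (r = f * theta)."""
    dx, dy = u - cx, v - cy
    theta = np.hypot(dx, dy) / f
    # sin(theta) / r written with sinc so the image centre (r = 0) is handled safely.
    s = np.sinc(theta / np.pi) / f
    return np.array([dx * s, dy * s, np.cos(theta)])

def project_equidistant(rays, f, cx, cy):
    """Rays of shape (..., 3) -> fisheye pixel coordinates under the same assumed model."""
    x, y, z = rays[..., 0], rays[..., 1], rays[..., 2]
    theta = np.arctan2(np.hypot(x, y), z)  # angle from the optical axis
    phi = np.arctan2(y, x)                 # azimuth in the image plane
    return cx + f * theta * np.cos(phi), cy + f * theta * np.sin(phi)

def look_at_rotation(direction):
    """Rotation whose columns are the virtual camera axes in the fisheye frame (z along direction)."""
    z = direction / np.linalg.norm(direction)
    up = np.array([0.0, 1.0, 0.0])
    if abs(z @ up) > 0.99:  # avoid a degenerate cross product
        up = np.array([1.0, 0.0, 0.0])
    x = np.cross(up, z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    return np.stack([x, y, z], axis=1)

def perspective_crop_maps(center_uv, crop_size, fov_deg, f, cx, cy):
    """For every pixel of a virtual pinhole crop, find the fisheye pixel it looks at."""
    d = unproject_equidistant(center_uv[0], center_uv[1], f, cx, cy)
    R = look_at_rotation(d)
    f_virt = 0.5 * crop_size / np.tan(np.radians(fov_deg) / 2)
    c = 0.5 * (crop_size - 1)
    us, vs = np.meshgrid(np.arange(crop_size), np.arange(crop_size))
    rays = np.stack(
        [(us - c) / f_virt, (vs - c) / f_virt, np.ones_like(us, dtype=float)], axis=-1
    )
    # Rotate the virtual-camera rays into the fisheye frame, then project them.
    map_u, map_v = project_equidistant(rays @ R.T, f, cx, cy)
    return map_u, map_v, R

def sample_nearest(image, map_u, map_v):
    """Nearest-neighbour lookup; coordinates outside the image are clamped to the border."""
    h, w = image.shape[:2]
    ui = np.clip(np.rint(map_u).astype(int), 0, w - 1)
    vi = np.clip(np.rint(map_v).astype(int), 0, h - 1)
    return image[vi, ui]
```

The returned rotation `R` relates the virtual crop camera to the fisheye camera, so a pose estimated in the crop could be mapped back to the original camera frame. A real pipeline would likely use a calibrated distortion model and bilinear interpolation instead of the simplifications assumed here.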

Key results

  • First complete model-free RGB-based 2D and 6D detection on the BOP-H3 benchmark
  • State-of-the-art performance across pinhole and fisheye datasets
  • Minimal-modification integration of Gaussian models into foundation model pipelines
  • Unified pinhole and fisheye handling via projection-aware perspective cropping

Why it matters

It provides a scalable, cost-effective alternative to CAD-dependent methods, enabling robust object detection in open-world robotics and mixed reality applications using only RGB images.

Abstract

We introduce GFreeDet2, which leverages Gaussian Splatting and foundation models to address RGB-based model-free 2D detection and 6D detection of unseen objects. GFreeDet2 reconstructs 3D Gaussian object models from multi-view RGB references, enabling efficient model-free detection without relying on CAD models. To accelerate reconstruction and consistently handle both pinhole and fisheye cameras, we propose projection-aware perspective cropping (PAPC) with visual hull initialization. PAPC further improves coarse 6D detection by accurately extracting pinhole crops from fisheye query images. The Gaussian objects enable rendering in place of CAD models within foundation model-driven pipelines, allowing existing state-of-the-art RGB-based methods for unseen 2D and 6D detection to be extended to the model-free setting with minimal modifications. Extensive experiments on all three BOP-H3 datasets demonstrate that GFreeDet2 achieves state-of-the-art performance and establishes a strong baseline for RGB-based, model-free 2D and 6D unseen object detection. The code is publicly available at github.com/wangg12/GFreeDet2.git.
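To make "visual hull initialization" concrete, here is a minimal voxel-carving sketch. It is not taken from the GFreeDet2 code. It assumes binary object masks, pinhole 3x4 projection matrices for the reference views (for example, after perspective cropping), and a known bounding box around the object. The function name and arguments are hypothetical.

```python
import numpy as np

def visual_hull_points(masks, projections, bounds, resolution=32):
    """Keep the voxel centres that project inside every silhouette mask.

    masks:       list of (H, W) boolean arrays (object silhouettes)
    projections: list of 3x4 pinhole matrices P = K [R | t] (ASSUMED pinhole views)
    bounds:      (lo, hi) corners of an axis-aligned box enclosing the object
    """
    lo, hi = bounds
    axes = [np.linspace(lo[i], hi[i], resolution) for i in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    homog = np.concatenate([grid, np.ones((len(grid), 1))], axis=1)

    keep = np.ones(len(grid), dtype=bool)
    for mask, P in zip(masks, projections):
        mask = np.asarray(mask, dtype=bool)
        h, w = mask.shape
        uvw = homog @ np.asarray(P, dtype=float).T
        in_front = uvw[:, 2] > 1e-9
        depth = np.where(in_front, uvw[:, 2], 1.0)  # dummy depth for points behind the camera
        u = np.rint(uvw[:, 0] / depth).astype(int)
        v = np.rint(uvw[:, 1] / depth).astype(int)
        inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(grid), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        keep &= hit  # a voxel survives only if every view sees it inside the silhouette
    return grid[keep]
```

The surviving points could serve as initial Gaussian centres, giving the optimizer a starting shape that already roughly matches the object. How the paper actually seeds its Gaussians from the hull is not described in this abstract.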

Index terms

Computer Vision for Automation; Perception for Grasping and Manipulation; Omnidirectional Vision