ICRA 2026

GPD-AP: A Grasp Pose-Driven Active Perception Framework for Occlusion-Robust Robotic Manipulation

Yancong Wei, Yunyi Pang, Sicheng Liu, Kangkang Dong, Houde Liu

AI summary

GPD-AP improves grasping success rates by 30% in dense obstacle environments by using grasp pose estimation to guide active viewpoint planning.

active perception, robotic grasping, occlusion robustness, reinforcement learning, grasp pose estimation, next-best-view

Problem

Robotic grasping in cluttered, occluded environments is hindered by sensor limitations and inefficient generic exploration strategies that fail to prioritize actionable views for manipulation.

Approach

The framework fuses a lightweight grasp pose estimator with a reinforcement learning policy, using pose distributions and confidence scores to dynamically plan next-best views that resolve occlusions and optimize grasp execution.
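One way to picture how grasp candidates could feed a reinforcement learning policy is sketched below. This is a minimal illustration, not the authors' implementation: the `Grasp` type, the `build_observation` function, the top-k size, and the confidence threshold are all hypothetical choices made for the example. It filters out low-confidence grasps, keeps the highest-scoring ones, and flattens them into a fixed-length vector that a policy could consume.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Grasp:
    """Hypothetical grasp candidate: a 3D position plus an estimator confidence."""
    position: Sequence[float]  # (x, y, z)
    score: float               # assumed to lie in [0, 1]


def build_observation(grasps: List[Grasp], k: int = 3,
                      min_score: float = 0.2) -> List[float]:
    """Filter low-confidence grasps, keep the top-k, and flatten to k * 4 values."""
    kept = sorted(
        (g for g in grasps if g.score >= min_score),
        key=lambda g: g.score,
        reverse=True,
    )[:k]
    obs: List[float] = []
    for g in kept:
        obs.extend([*g.position, g.score])
    # Zero-pad so the vector length stays constant, even when no grasp is visible.
    obs.extend([0.0] * (4 * (k - len(kept))))
    return obs


if __name__ == "__main__":
    candidates = [
        Grasp((0.1, 0.2, 0.3), 0.9),
        Grasp((0.0, 0.0, 0.0), 0.1),  # below the threshold, dropped
        Grasp((1.0, 1.0, 1.0), 0.5),
    ]
    print(build_observation(candidates))
```

An all-zero vector in this sketch corresponds to a fully occluded target, which is the case where a viewpoint change is most needed.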

Key results

30% increase in grasping success rates in dense obstacle environments
55% success rate in the hardest five-obstacle simulation scenario, outperforming AnyGrasp and GAMMA baselines
Novel reset module that systematically generates randomized, highly occluded training scenes
End-to-end integration of grasp pose filtering and fusion into RL observation space for dynamic viewpoint planning
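The idea of a reset routine that yields occluded scenes can be illustrated with simple rejection sampling. The sketch below is an assumption-laden toy, not the paper's reset module: it works in a 2D top-down plane, treats obstacles as discs, ignores object collisions, and uses hypothetical names (`reset_scene`, `segment_hits_disc`) and workspace bounds. It resamples random layouts until at least one obstacle blocks the camera's line of sight to the target.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def segment_hits_disc(a: Point, b: Point, c: Point, r: float) -> bool:
    """Return True if segment a-b passes within distance r of centre c."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0
    else:
        # Project c onto the segment and clamp to its endpoints.
        t = max(0.0, min(1.0, ((cx - ax) * dx + (cy - ay) * dy) / length_sq))
    px, py = ax + t * dx, ay + t * dy
    return math.hypot(cx - px, cy - py) <= r


def reset_scene(n_obstacles: int, camera: Point = (0.0, 0.0),
                radius: float = 0.05, rng: Optional[random.Random] = None,
                max_tries: int = 1000) -> Tuple[Point, List[Point]]:
    """Sample a target and obstacles until the target is occluded from the camera."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Hypothetical workspace bounds: obstacles sit nearer the camera than the target.
        target = (rng.uniform(0.5, 0.8), rng.uniform(-0.3, 0.3))
        obstacles = [(rng.uniform(0.1, 0.35), rng.uniform(-0.3, 0.3))
                     for _ in range(n_obstacles)]
        if any(segment_hits_disc(camera, target, o, radius) for o in obstacles):
            return target, obstacles
    raise RuntimeError("could not sample an occluded scene")


if __name__ == "__main__":
    target, obstacles = reset_scene(5, rng=random.Random(0))
    print("target:", target)
    print("obstacles:", obstacles)
```

A real simulator reset would also need collision-free placement and a visibility test based on rendered images or ray casting; the line-of-sight check here only stands in for that step.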

Why it matters

Provides a scalable pathway for robots to autonomously navigate and manipulate objects in unstructured, cluttered real-world environments.

Abstract

Humans instinctively adjust their viewpoints to resolve occlusions and infer spatial relationships, enabling effective perception and navigation in cluttered environments. This capability, however, remains a significant challenge for robotic systems. To address this, we propose GPD-AP, a novel active perception framework that leverages grasp pose estimation and associated scoring to systematically tackle grasping tasks in occluded and cluttered settings. The core innovation lies in an end-to-end system where a computationally efficient grasp pose estimation module directly informs a Next-Best-View (NBV) planner. This integration shifts the focus from generic scene exploration to a grasp-oriented visual search, guiding the robot to viewpoints that minimize uncertainty about potential grasps. To train and validate GPD-AP, we introduce a simulation reset method capable of generating highly challenging scenes with partially or fully occluded target objects. Experimental results demonstrate that GPD-AP improves grasping success rates by 30% in dense obstacle environments, effectively enabling the transition of target objects from invisible to visible and graspable states. This work marks a significant step towards autonomous and intelligent robotic manipulation in unstructured real-world scenarios.

Index terms

Grasping, Reinforcement Learning, Perception for Grasping and Manipulation