GPD-AP: A Grasp Pose-Driven Active Perception Framework for Occlusion-Robust Robotic Manipulation
Yancong Wei, Yunyi Pang, Sicheng Liu, Kangkang Dong, Houde Liu
AI summary
Problem
Robotic grasping in cluttered, occluded environments is hindered by sensor limitations and inefficient generic exploration strategies that fail to prioritize actionable views for manipulation.
Approach
The framework fuses a lightweight grasp pose estimator with a reinforcement learning policy, using pose distributions and confidence scores to dynamically plan next-best views that resolve occlusions and optimize grasp execution.
Key results
- 30% increase in grasping success rates in dense obstacle environments
- 55% success rate in the hardest five-obstacle simulation scenario, outperforming AnyGrasp and GAMMA baselines
- Novel reset module that systematically generates randomized, highly occluded training scenes
- End-to-end integration of grasp pose filtering and fusion into RL observation space for dynamic viewpoint planning
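The last point, feeding filtered grasp poses into the RL observation, can be illustrated with a minimal sketch. It is not the authors' implementation: the `Grasp` container, the 8-value encoding (position, quaternion, score), the top-k selection, and the `min_score` threshold are all assumptions chosen for illustration. The idea is only that a variable-length set of scored grasp candidates has to become a fixed-size vector before a policy network can consume it.

```python
from dataclasses import dataclass
from typing import List, Sequence

GRASP_DIM = 8  # assumed encoding: 3 (position) + 4 (quaternion) + 1 (score)


@dataclass
class Grasp:
    """Hypothetical container for one grasp candidate."""
    position: Sequence[float]    # (x, y, z)
    quaternion: Sequence[float]  # (qx, qy, qz, qw)
    score: float                 # estimator confidence, assumed to lie in [0, 1]


def build_grasp_observation(grasps: List[Grasp], k: int = 4,
                            min_score: float = 0.3) -> List[float]:
    """Filter low-confidence grasps, keep the k best, and zero-pad to k * GRASP_DIM."""
    # Filtering step: drop candidates below the confidence threshold.
    kept = [g for g in grasps if g.score >= min_score]
    # Highest-confidence candidates first.
    kept.sort(key=lambda g: g.score, reverse=True)

    obs: List[float] = []
    for g in kept[:k]:
        obs.extend([*g.position, *g.quaternion, g.score])

    # Zero-padding keeps the vector length constant when fewer than k grasps survive,
    # including the fully occluded case where no grasp is detected at all.
    obs.extend([0.0] * (k * GRASP_DIM - len(obs)))
    return obs
```

With no surviving grasps the vector is all zeros, which gives the policy an explicit "target not yet graspable" signal under this assumed encoding.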
Why it matters
Provides a scalable pathway for robots to autonomously navigate and manipulate objects in unstructured, cluttered real-world environments.
Abstract
Humans instinctively adjust their viewpoints to resolve occlusions and infer spatial relationships, enabling effective perception and navigation in cluttered environments. This capability, however, remains a significant challenge for robotic systems. To address this, we propose GPD-AP, a novel active perception framework that leverages grasp pose estimation and associated scoring to systematically tackle grasping tasks in occluded and cluttered settings. The core innovation lies in an end-to-end system where a computationally efficient grasp pose estimation module directly informs a Next-Best-View (NBV) planner. This integration shifts the focus from generic scene exploration to a grasp-oriented visual search, guiding the robot to viewpoints that minimize uncertainty about potential grasps. To train and validate GPD-AP, we introduce a simulation reset method capable of generating highly challenging scenes with partially or fully occluded target objects. Experimental results demonstrate that GPD-AP improves grasping success rates by 30% in dense obstacle environments, effectively enabling the transition of target objects from invisible to visible and graspable states. This work marks a significant step towards autonomous and intelligent robotic manipulation in unstructured real-world scenarios.
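The simulation reset method is described only at a high level here, so the following is a toy sketch of the general idea rather than the paper's procedure. It assumes a top-down 2D workspace, disc-shaped obstacles, a fixed initial camera position, and rejection sampling until the line of sight to the target is blocked. All function names, ranges, and radii are hypothetical, and the sketch does not check for collisions between objects.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp the parameter to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def is_occluded(camera: Point, target: Point,
                obstacles: List[Point], radius: float) -> bool:
    """True if any obstacle disc intersects the camera-to-target line of sight."""
    return any(point_segment_distance(o, camera, target) < radius for o in obstacles)


def reset_scene(rng: random.Random, camera: Point = (0.0, 0.0),
                n_obstacles: int = 5, radius: float = 0.05,
                max_tries: int = 100) -> Tuple[Point, List[Point]]:
    """Sample a target and obstacles; retry until the target starts out occluded."""
    for _ in range(max_tries):
        target = (rng.uniform(0.4, 0.8), rng.uniform(-0.3, 0.3))
        obstacles = []
        for _ in range(n_obstacles):
            # Bias obstacles toward the line of sight, then add lateral jitter.
            t = rng.uniform(0.3, 0.8)
            ox = camera[0] + t * (target[0] - camera[0]) + rng.uniform(-0.1, 0.1)
            oy = camera[1] + t * (target[1] - camera[1]) + rng.uniform(-0.1, 0.1)
            obstacles.append((ox, oy))
        if is_occluded(camera, target, obstacles, radius):
            return target, obstacles
    raise RuntimeError("could not sample an occluded scene")
```

A call such as `reset_scene(random.Random(0))` returns a target position and five obstacle centres for which the initial view is blocked. A real reset module would work in 3D inside a physics simulator and would need to handle object collisions and partial visibility, which this sketch ignores.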