Active-Perceptive Language-Oriented Grasp Policy for Heavily Cluttered Scenes
Yixiang Dai, Siang Chen, Kaiqin Yang, Dingchang Hu, Pengwei Xie, Guosheng Li, Yuan Shen, Guijin Wang
AI summary
Problem
Language-guided robotic grasping in heavily cluttered environments fails due to severe occlusions and perceptual ambiguity. Existing methods rely on static viewpoints or inefficient pre-scanning paradigms that cannot sequentially uncover buried targets.
Approach
The proposed APeG framework dynamically interleaves grasp attempts with active viewpoint planning based on joint occlusion-semantic information gain, while using a grasp-wise reinforcement learning policy to select robust grasp poses.
Key results
- Proposes APeG, a closed-loop active-perceptive framework for language-oriented grasping
- Introduces an occlusion-aware, semantic-guided viewpoint optimization strategy
- Develops a grasp-wise reinforcement learning policy for robust grasp selection
- Demonstrates significant improvements in task success rate and efficiency over baselines in simulation and real-world tests
Why it matters
Advances practical language-conditioned robotic manipulation by enabling reliable object retrieval in complex, occluded environments where traditional methods fail.
Abstract
Language-guided robotic grasping in cluttered environments presents significant challenges due to severe occlusions and complex scene structures, which often hinder accurate target localization. Existing approaches typically suffer from limited observational capabilities, resulting in suboptimal exploration of the target object. In this paper, we propose a novel Active-Perceptive Language-Oriented Grasp Policy (APeG) for heavily cluttered scenes. APeG develops an active perception scheme in the grasp pipeline via an occlusion-aware, semantic-guided viewpoint optimization strategy, enabling efficient exploration of cluttered scenes. In addition, a grasp-wise Reinforcement Learning (RL) policy is proposed to select robust grasp poses. Extensive real-world experiments validate the effectiveness of APeG, demonstrating significant improvements in both task success rate and operational efficiency over existing baselines, highlighting its potential for practical deployment in language-conditioned robotic manipulation.
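As a rough illustration of occlusion-aware, semantic-guided viewpoint selection, the Python sketch below scores candidate viewpoints by a weighted sum of an occlusion term and a semantic term and picks the highest-scoring one. The abstract does not state APeG's actual objective, so the weighted-sum form, the `Viewpoint` fields, the `weight` parameter, and all numeric values are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Viewpoint:
    """Hypothetical candidate camera pose with two precomputed scores."""
    name: str
    occlusion_gain: float  # assumed: share of hidden volume this view would reveal, in [0, 1]
    semantic_gain: float   # assumed: relevance of the revealed region to the language query, in [0, 1]


def joint_gain(view: Viewpoint, weight: float = 0.5) -> float:
    """Weighted sum of the two terms (an assumed form, not the paper's objective)."""
    return weight * view.occlusion_gain + (1.0 - weight) * view.semantic_gain


def select_next_view(candidates: list[Viewpoint], weight: float = 0.5) -> Viewpoint:
    """Return the candidate with the highest joint gain."""
    if not candidates:
        raise ValueError("at least one candidate viewpoint is required")
    return max(candidates, key=lambda v: joint_gain(v, weight))


# Illustrative values only; they do not come from the paper.
candidates = [
    Viewpoint("top", occlusion_gain=0.2, semantic_gain=0.9),
    Viewpoint("left", occlusion_gain=0.7, semantic_gain=0.3),
    Viewpoint("front", occlusion_gain=0.6, semantic_gain=0.6),
]
best = select_next_view(candidates)
print(best.name)
```

In a closed-loop setting, a selection step of this kind would presumably be re-run after each grasp attempt, since removing an object changes both what is occluded and where the target is likely to be.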