IROS 2024

RelationGrasp: Object-Oriented Prompt Learning for Simultaneously Grasp Detection and Manipulation Relationship in Open Vocabulary

Songting Liu, Tat Joo Teo, Zhiping Lin, Haiyue Zhu

Abstract

Autonomous robotic grasping in complex, cluttered, and unstructured environments is a fundamental but challenging task. To achieve human-like rationality in dealing with the grasping task, the agent requires hybrid intelligence from multilateral aspects. This paper introduces RelationGrasp, a unified framework employing a transformer encoder-decoder structure to simultaneously achieve open-vocabulary object detection, manipulation relationship inference, and grasp pose detection. A unique object-oriented prompt learning mechanism is designed to seamlessly bridge the grasp pose and manipulation relationship branches, delivering high fidelity of object-grasp affiliation for object-aware grasping and grasp sequence planning. By formulating the relationship detection as an adjacency matrix regression task under multi-task learning, our framework significantly increases the relationship accuracy with reduced computational overhead. Moreover, to facilitate the robust and adaptive deployment of the proposed RelationGrasp to novel environments, we propose a consistency-based self-supervised adaptation strategy to adapt the pre-trained network to new scenarios and improve grasp accuracy on unseen objects. Our proposed network achieves state-of-the-art performance on public datasets such as VMRD and OCID in both grasp detection and manipulation relationship classification, and real-world robot experiments have also been conducted to show its practical use.
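
To illustrate how a regressed adjacency matrix could drive grasp sequence planning, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical convention in which entry (i, j) of the matrix scores the relation "object i must be removed before object j", thresholds the scores at 0.5, and orders the objects with a standard topological sort (Kahn's algorithm). The function names `binarize` and `grasp_order`, the threshold, and the example scores are all illustrative assumptions.

```python
from collections import deque


def binarize(scores, threshold=0.5):
    """Turn regressed relationship scores into a 0/1 adjacency matrix.

    The threshold is an assumed value, not one taken from the paper.
    """
    return [[1 if s > threshold else 0 for s in row] for row in scores]


def grasp_order(adj):
    """Return an object order that respects every 'i before j' edge.

    Assumed convention: adj[i][j] == 1 means object i must be removed
    before object j. Raises ValueError if the relationships are cyclic.
    """
    n = len(adj)
    # For each object, count how many other objects still block it.
    blockers = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = deque(j for j in range(n) if blockers[j] == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(i)
        # Removing object i unblocks every object it was sitting on.
        for j in range(n):
            if adj[i][j]:
                blockers[j] -= 1
                if blockers[j] == 0:
                    ready.append(j)
    if len(order) != n:
        raise ValueError("cyclic manipulation relationships")
    return order


# Hypothetical scene: object 0 rests on object 1, which rests on object 2.
scores = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.8],
    [0.0, 0.2, 0.0],
]
print(grasp_order(binarize(scores)))
```

Under the assumed convention, the stacked scene yields the order 0, 1, 2, so the topmost object is grasped first. A cyclic prediction raises an error instead of returning a partial plan, which leaves the recovery policy to the caller.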

Index terms

Deep Learning in Grasping and Manipulation; Perception for Grasping and Manipulation; Grasping