Grasp Like Humans: Learning Generalizable Multi-Fingered Grasping from Human Proprioceptive Sensorimotor Integration
Ce Guo, Xieyuanli Chen, Zhiwen Zeng, Zirui Guo, YiHong Li, Haoran Xiao, Dewen Hu, Huimin Lu
AI summary
Problem
Robotic hands struggle with reliable finger coordination and contact force management for stable grasping, particularly with deformable or irregular objects, while existing imitation learning methods often rely on unintuitive teleoperation or device-specific demonstrations that hinder cross-platform generalization.
Approach
The authors use a wearable data glove to capture natural human tactile and kinesthetic feedback during grasping, encode it into a unified graph structure, and train a spatio-temporal graph network to predict joint states and contact forces that are mapped to robotic hand commands.
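As a rough illustration of the idea above, the sketch below builds a hand graph and attaches a polar-style feature pair to each joint node. Everything in it is an assumption made for illustration and is not taken from the paper: the 16-node layout (one palm node plus three joints per finger), the reading of "polar coordinates" as (r, theta) with r a normalized contact force and theta a joint angle, and the function names. The propagation step is a standard normalized graph convolution, used here as a stand-in; it is not the TK-STGN subgraph convolution.

```python
import numpy as np

FINGERS = ["thumb", "index", "middle", "ring", "little"]
JOINTS_PER_FINGER = 3  # assumption: three sensed joints per finger


def build_hand_adjacency():
    """Adjacency of a hypothetical hand graph: node 0 is the palm,
    and each finger is a chain of joints attached to it."""
    n = 1 + len(FINGERS) * JOINTS_PER_FINGER
    adj = np.zeros((n, n))
    for f in range(len(FINGERS)):
        prev = 0  # every finger chain starts at the palm node
        for j in range(JOINTS_PER_FINGER):
            node = 1 + f * JOINTS_PER_FINGER + j
            adj[prev, node] = adj[node, prev] = 1.0
            prev = node
    return adj


def polar_node_features(joint_angles, contact_forces, max_force):
    """Assumed encoding: r = contact force scaled to [0, 1],
    theta = joint angle in radians. Returns an (n_nodes, 2) array."""
    r = np.clip(np.asarray(contact_forces) / max_force, 0.0, 1.0)
    theta = np.asarray(joint_angles, dtype=float)
    return np.stack([r, theta], axis=1)


def gcn_step(adj, features, weights):
    """One generic graph-convolution step with self-loops and
    symmetric degree normalization, followed by ReLU."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)
```

Because the graph topology, not the sensor layout of a particular device, carries the structure, the same feature format could in principle be filled from a human demonstrator or a robotic hand wearing the glove.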
Key results
- Unified graph representation encoding joint motions and contact forces with polar coordinates
- TK-STGN network leveraging subgraph convolutions and attention-based LSTM for spatio-temporal prediction
- Force-position hybrid mapping enabling cross-platform generalization across six robotic hand configurations
- Superior grasp success rate, finger coordination, and force management compared to baselines on diverse and deformable objects
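To make the force-position hybrid mapping in the list above more concrete, here is a minimal sketch of one way a predicted joint state could become a robot command. It is an assumption for illustration, not the authors' mapping: the linear range rescaling, the admittance-style force correction, the `force_gain` value, and all names are hypothetical, and it assumes that a larger joint angle means more flexion.

```python
from dataclasses import dataclass


@dataclass
class JointLimits:
    lower: float  # radians
    upper: float  # radians


def map_range(value, src, dst):
    """Linearly rescale a joint angle from the source range to the
    destination range, saturating at the range ends."""
    ratio = (value - src.lower) / (src.upper - src.lower)
    ratio = min(max(ratio, 0.0), 1.0)
    return dst.lower + ratio * (dst.upper - dst.lower)


def hybrid_command(pred_angle, pred_force, measured_force,
                   human, robot, force_gain=0.05):
    """Hypothetical hybrid command: a position target from the predicted
    angle, nudged by the gap between predicted and measured contact force."""
    position_target = map_range(pred_angle, human, robot)
    # Too little measured force -> flex further; too much -> back off.
    correction = force_gain * (pred_force - measured_force)
    command = position_target + correction
    return min(max(command, robot.lower), robot.upper)
```

In this sketch only the `JointLimits` of the target hand change between platforms, which is one simple way such a mapping could carry over to different robotic hands.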
Why it matters
Provides a scalable, vision-free pathway for robotic hands to achieve human-like dexterous manipulation, advancing real-world applications in unstructured environments.
Abstract
Tactile and kinesthetic perceptions are crucial for human dexterous manipulation, enabling reliable grasping of objects via proprioceptive sensorimotor integration. For robotic hands, even though acquiring such tactile and kinesthetic feedback is feasible, establishing a direct mapping from this sensory feedback to motor actions remains challenging. In this article, we propose a novel glove-mediated tactile–kinematic perception–prediction framework for grasp skill transfer from human intuitive and natural operation to robotic execution based on imitation learning, and its effectiveness is validated through generalized grasping tasks, including those involving deformable objects. First, we integrate a data glove to capture tactile and kinesthetic data at the joint level. The glove is adaptable for both human and robotic hands, allowing data collection from natural human hand demonstrations across different scenarios. It ensures consistency in the raw data format, enabling evaluation of grasping for both human and robotic hands. Second, we establish a unified representation of multimodal inputs based on graph structures with polar coordinates. We explicitly integrate the morphological differences into the designed representation, enhancing the compatibility across different demonstrators and robotic hands. Furthermore, we introduce the tactile–kinesthetic spatio-temporal graph networks, which leverage multidimensional subgraph convolutions and attention-based long short-term memory (LSTM) layers to extract spatio-temporal features from graph inputs to predict node-based states for each hand joint. These predictions are then mapped to final commands through a force-position hybrid mapping. Comparative experiments and ablation studies demonstrate that our approach surpasses other methods in grasp success rate, finger coordination, contact force management, and both grasp and computational efficiency, achieving results most akin to human grasping.
The robustness of our approach is also validated through multiple randomized experimental setups, and its generalization capability is tested across diverse objects and robotic hands.
Received 9 July 2025; accepted 2 September 2025. Date of publication 23 September 2025; date of current version 8 October 2025. This work was supported in part by the National Science Foundation of China under Grant U22A2059, Grant 62203460, Grant 62403478, and Grant T2521006, in part by the Young Elite Scientists Sponsorship Program by CAST under Grant 2023QNRC001, and in part by the Innovation Research Foundation of National University of Defense Technology. This article was recommended for publication by Associate Editor N. Correll and Editor J. Bohg upon evaluation of the reviewers' comments. (Ce Guo and Xieyuanli Chen contributed equally to this work.) (Corresponding authors: Dewen Hu; Huimin Lu.)
The authors are with the College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China (e-mail: dwhu@nudt.edu.cn; lhmnew@nudt.edu.cn).
The dataset and additional supporting videos are accessible via https://grasplikehuman.github.io/.
This article has supplementary downloadable material available at https://doi.org/10.1109/TRO.2025.3613541, provided by the authors.
Digital Object Identifier 10.1109/TRO.2025.3613541