OmniDexGrasp: Generalizable Dexterous Grasping Via Foundation Model and Force Feedback
Yi-Lin Wei, Zhexi Luo, Yuhao Lin, Mu Lin, Zhizhao Liang, Shuoyu Chen, Wei-Shi Zheng
AI summary
Problem
Existing dexterous grasping methods struggle to generalize across diverse objects, tasks, and robot embodiments due to limited semantic grasp datasets and the gap between high-level foundation model knowledge and low-level physical robot execution constraints.
Approach
The framework uses foundation models to generate human grasp images from diverse user prompts, transfers these images to executable dexterous robot actions via pose retargeting, and applies a force-aware adaptive control strategy to ensure stable, physically plausible grasps.
Key results
- Achieves omni-capable dexterous grasping across diverse prompts, objects, and robot embodiments
- Introduces a learning-free human-image-to-robot-action transfer strategy for executable grasp generation
- Implements force-aware adaptive control for stable and safe physical grasp execution
- Demonstrates successful extension to dexterous manipulation and cross-robot generalization in simulation and real-world tests
Why it matters
Enables robots to understand and execute complex human commands for grasping and manipulation across novel objects and environments without requiring task-specific training data.
Abstract
Enabling robots to dexterously grasp and manipulate objects based on human commands is a promising direction in robotics. However, existing approaches struggle to generalize across diverse objects or tasks due to the limited scale of semantic dexterous grasp datasets. Foundation models offer a new way to enhance generalization, yet directly leveraging them to generate feasible robotic actions remains challenging due to the gap between abstract model knowledge and physical robot execution. To address these challenges, we propose OmniDexGrasp, a generalizable framework that achieves omni-capabilities in user prompting, dexterous embodiment, and grasping tasks by combining foundation models with transfer and control strategies. OmniDexGrasp integrates three key modules: (i) foundation models enhance generalization by generating human grasp images, supporting omni-capability in user prompts and tasks; (ii) a human-image-to-robot-action transfer strategy converts human demonstrations into executable robot actions, enabling omni dexterous embodiment; (iii) a force-aware adaptive grasp strategy ensures robust and stable grasp execution. Experiments in simulation and on real robots validate the effectiveness of OmniDexGrasp on diverse user prompts, grasp tasks, and dexterous hands, and further results show its extensibility to dexterous manipulation tasks.
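As a rough illustration of what a force-aware grasp strategy like module (iii) could look like, the sketch below closes a hand in small increments until a measured contact force reaches a target. This is not the authors' controller: the abstract gives no implementation details, so the names `force_aware_close`, `spring_sensor`, and `GraspResult`, the scalar "closure" parameterization, the 2 N target, and the linear-spring contact model are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GraspResult:
    closure: float        # final hand closure in [0, 1] (hypothetical scalar parameterization)
    force: float          # last measured contact force, in newtons
    reached_target: bool  # True if the target force was reached


def force_aware_close(
    read_force: Callable[[float], float],
    target_force: float = 2.0,
    start: float = 0.0,
    step: float = 0.05,
    max_closure: float = 1.0,
) -> GraspResult:
    """Close the hand step by step until the measured force reaches the target.

    `read_force(closure)` stands in for commanding the hand to `closure`
    and reading a force sensor; it is an assumed interface, not a real API.
    `start` would typically come from a retargeted grasp pose.
    """
    closure = start
    force = read_force(closure)
    while force < target_force and closure < max_closure:
        # Clamp so the commanded closure never exceeds the joint limit.
        closure = min(closure + step, max_closure)
        force = read_force(closure)
    return GraspResult(closure, force, force >= target_force)


def spring_sensor(contact_at: float, stiffness: float) -> Callable[[float], float]:
    """Toy contact model: zero force before contact, linear spring after it."""
    return lambda closure: stiffness * max(0.0, closure - contact_at)


if __name__ == "__main__":
    # Object contacted at 50% closure; force grows by 20 N per unit of closure.
    result = force_aware_close(spring_sensor(contact_at=0.5, stiffness=20.0))
    print(result)
```

Stopping on measured force rather than at a fixed joint target is what lets a loop of this kind tolerate errors in the initial pose: the hand keeps closing if contact comes later than expected, and stops early if it comes sooner.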