Learning Dexterous Manipulation Skills from Imperfect Simulations
Elvis Hsieh, Wen-Han Hsieh, Yen-Jen Wang, Toru Lin, Jitendra Malik, Koushil Sreenath, Haozhi Qi
AI summary
Problem
Simulating complex contact dynamics and tactile feedback is difficult, while direct teleoperation of multi-fingered hands is challenging due to morphology differences between humans and robots.
Approach
The DexScrew framework trains RL policies on simplified simulation models to learn rotational gaits, uses these as primitives for assisted teleoperation to collect real-world data, and then trains a behavior cloning policy using tactile feedback.
Key results
- Successful nut-bolt fastening and screwdriving with multi-fingered hands
- Higher task progress ratios than direct sim-to-real transfer
- Generalization across diverse object geometries including square, triangular, hexagonal, and cross-shaped nuts
- Highest accuracy and fastest execution when tactile sensing is combined with temporal history
Why it matters
Provides a scalable path toward autonomous dexterous manipulation in unstructured environments without requiring high-fidelity physics simulations.
Abstract
Reinforcement learning and sim-to-real transfer have made significant progress in dexterous manipulation. However, progress remains limited by the difficulty of simulating complex contact dynamics and multisensory signals, especially tactile feedback. In this work, we propose DexScrew, a sim-to-real framework that addresses these limitations and demonstrates its effectiveness on nut-bolt fastening and screwdriving with multi-fingered hands. The framework has three stages. First, we train reinforcement learning policies in simulation using simplified object models that lead to the emergence of correct finger gaits. We then use the learned policy as a skill primitive within a teleoperation system to collect real-world demonstrations that contain tactile and proprioceptive information. Finally, we train a behavior cloning policy that incorporates tactile sensing and show that it generalizes to nuts and screwdrivers with diverse geometries. Experiments across both tasks show higher task progress ratios than direct sim-to-real transfer and robust performance even on unseen object shapes and under external perturbations.
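To make the third stage concrete, the following is a minimal sketch of behavior cloning on demonstrations that pair proprioceptive and tactile readings with actions, using a short observation history as input. It is an illustration under stated assumptions and not the authors' implementation: the ridge-regularized linear policy stands in for whatever learned model the paper uses, and the function names, the dimensions (16 joint values, 8 tactile channels), the history length, and the synthetic data are all hypothetical.

```python
import numpy as np


def stack_history(obs, history):
    """Concatenate each frame with its previous `history - 1` frames.

    The start of the sequence is padded by repeating the first frame, so the
    output has one row per input frame and `history * D` columns.
    """
    num_steps = obs.shape[0]
    pad = np.repeat(obs[:1], history - 1, axis=0)
    padded = np.concatenate([pad, obs], axis=0)
    return np.stack(
        [padded[t:t + history].reshape(-1) for t in range(num_steps)], axis=0
    )


def build_features(proprio, tactile, history):
    """Fuse proprioception and touch per frame, stack history, append a bias."""
    obs = np.concatenate([proprio, tactile], axis=1)
    feats = stack_history(obs, history)
    return np.concatenate([feats, np.ones((feats.shape[0], 1))], axis=1)


def fit_bc_policy(proprio, tactile, actions, history=3, ridge=1e-3):
    """Behavior cloning as ridge regression from observations to actions.

    Assumption: a linear policy is used here only to keep the sketch small.
    """
    x = build_features(proprio, tactile, history)
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ actions)


def predict(weights, proprio, tactile, history=3):
    """Apply the cloned policy to a sequence of observations."""
    return build_features(proprio, tactile, history) @ weights


# Synthetic stand-in for real-world demonstrations (hypothetical sizes).
rng = np.random.default_rng(0)
num_steps, num_joints, num_taxels = 200, 16, 8
proprio = rng.normal(size=(num_steps, num_joints))
tactile = rng.normal(size=(num_steps, num_taxels))
true_map = rng.normal(size=(num_joints + num_taxels, num_joints))
actions = np.concatenate([proprio, tactile], axis=1) @ true_map

weights = fit_bc_policy(proprio, tactile, actions, history=3)
pred = predict(weights, proprio, tactile, history=3)
mse = float(np.mean((pred - actions) ** 2))
```

Setting `history=1` gives a policy that sees only the current frame, and dropping the `tactile` columns gives a proprioception-only policy, so the same sketch can be used to compare input choices on held-out demonstrations.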