Few-Shot Transfer of Tool-Use Skills Using Human Demonstrations with Proximity and Tactile Sensing
Marina Y. Aoyama, Sethu Vijayakumar, Tetsuya Narita
AI summary
Problem
Teaching robots to manipulate grasped tools is hindered by complex intrinsic and extrinsic contact dynamics, limited real-world demonstration data, and the sim-to-real gap.
Approach
Pre-train a sequence-to-sequence policy on simulated primitive motions to learn contact state recognition, then fine-tune it with a few real-world human demonstrations to bridge the domain gap.
Key results
- Enables surface-following with diverse tools using minimal real demonstrations
- Matches human motion and contact force profiles closely
- Combined tactile and proximity sensors effectively identify contact states despite noise
- Latent space captures transferable tool-environment contact relationships
Why it matters
Provides a data-efficient pathway for robots to adapt to new tools and unstructured environments without extensive retraining or physical modeling.
Abstract
Tools extend the manipulation abilities of robots, much like they do for humans. Despite human expertise in tool manipulation, teaching robots these skills faces challenges. The complexity arises from the interplay of two simultaneous points of contact: one between the robot and the tool, and another between the tool and the environment. Tactile and proximity sensors play a crucial role in identifying these complex contacts. However, learning tool manipulation using these sensors remains challenging due to limited real-world data and the large sim-to-real gap. To address this, we propose a few-shot tool-use skill transfer framework using multimodal sensing. The framework involves pre-training the base policy to capture contact states common in tool-use skills in simulation and fine-tuning it with human demonstrations collected in the real-world target domain to bridge the domain gap. We validate that this framework enables teaching surface-following tasks using tools with diverse physical and geometric properties with a small number of demonstrations on the Franka Emika robot arm. Our analysis suggests that the robot acquires new tool-use skills by transferring the ability to recognise tool-environment contact relationships from pre-trained to fine-tuned policies. Additionally, combining proximity and tactile sensors enhances the identification of contact states and environmental geometry. See our videos at https://sony.github.io/tool-use-few-shot-transfer/.
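The pre-train-then-fine-tune idea described above can be illustrated with a toy sketch. This is not the authors' sequence-to-sequence policy or their data: it swaps in a plain linear model trained by gradient descent, and every name here (`make_data`, `fit`, `mse`, the `shift` parameter, the four-feature input) is a hypothetical stand-in chosen for illustration. The point is only to show the two-stage structure: fit on plentiful "simulated" samples, then continue training from those weights on a handful of "real" samples drawn from a shifted domain.

```python
import numpy as np

def make_data(n, shift, rng):
    """Synthetic stand-in for sensor features -> action targets (hypothetical)."""
    x = rng.normal(size=(n, 4))              # e.g. tactile + proximity features
    w_true = np.array([1.0, -2.0, 0.5, 0.0])
    y = x @ w_true + shift * x[:, 3]         # 'shift' mimics a sim-to-real gap
    return x, y

def fit(x, y, w, lr, steps):
    """Plain gradient descent on mean squared error, starting from weights w."""
    for _ in range(steps):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)
        w = w - lr * grad
    return w

def mse(x, y, w):
    """Mean squared prediction error of the linear model w on (x, y)."""
    return float(np.mean((x @ w - y) ** 2))

rng = np.random.default_rng(0)
x_sim, y_sim = make_data(2000, shift=0.0, rng=rng)    # plentiful simulated data
x_real, y_real = make_data(8, shift=1.5, rng=rng)     # a few real demonstrations
x_test, y_test = make_data(500, shift=1.5, rng=rng)   # held-out real-domain data

# Stage 1: pre-train from scratch on simulated data only.
w_pre = fit(x_sim, y_sim, np.zeros(4), lr=0.05, steps=500)

# Stage 2: fine-tune, starting from the pre-trained weights, on the few real samples.
w_ft = fit(x_real, y_real, w_pre.copy(), lr=0.05, steps=500)

err_pre = mse(x_test, y_test, w_pre)   # pre-trained model evaluated in the real domain
err_ft = mse(x_test, y_test, w_ft)     # fine-tuned model evaluated in the real domain
print(f"real-domain MSE before fine-tuning: {err_pre:.3f}, after: {err_ft:.3f}")
```

In this toy setting the fine-tuned weights only need to correct the part of the mapping that differs between domains, which is why a few samples suffice. How well that intuition carries over to a learned contact-state representation is the question the paper itself investigates.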