FreeTacMan: Robot-Free Visuo-Tactile Data Collection System for Contact-Rich Manipulation
Longyan Wu, Checheng Yu, Jieji Ren, Li Chen, Yufei Jiang, Ran Huang, Guoying Gu, Hongyang Li
AI summary
Problem
Collecting high-fidelity, large-scale visuo-tactile datasets for contact-rich robot manipulation is hindered by inefficient data collection setups, limited sensor feedback, and a lack of real-time tactile signals in existing handheld or teleoperation systems.
Approach
FreeTacMan uses a wearable in-situ gripper with integrated visuo-tactile sensors and a high-precision optical motion capture system to record synchronized visual, tactile, and pose data directly from human fingertips without mechanical attenuation.
Key results
- A modular, robot-free wearable gripper enabling direct, real-time tactile feedback with sub-millimeter pose tracking
- A large-scale multimodal dataset comprising over 3 million visuo-tactile image pairs and 10k trajectories across 50 contact-rich tasks
- Imitation policies trained on the dataset achieve a 50% higher success rate than vision-only baselines on challenging manipulation tasks
- A CLIP-style temporal-aware tactile pretraining strategy that effectively bridges the visual-tactile domain gap for policy learning
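To make the "CLIP-style" objective in the last result concrete, here is a minimal sketch of a symmetric contrastive (InfoNCE) loss over paired visual and tactile embeddings. It illustrates only the generic CLIP formulation. The summary does not describe how the temporal-aware component works, so it is not modeled here. The function name `clip_style_loss`, the temperature value, and the embedding shapes are all illustrative assumptions, not details from the paper.

```python
import numpy as np


def clip_style_loss(visual_emb, tactile_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of `visual_emb` and row i of `tactile_emb` are assumed to come
    from the same moment in time (a positive pair); every other row in the
    batch acts as a negative. Hypothetical helper, not the paper's code.
    """
    # L2-normalise so the dot product is a cosine similarity.
    v = visual_emb / np.linalg.norm(visual_emb, axis=1, keepdims=True)
    t = tactile_emb / np.linalg.norm(tactile_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the temperature.
    logits = v @ t.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the visual-to-tactile and tactile-to-visual directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


rng = np.random.default_rng(0)
visual = rng.normal(size=(8, 16))
aligned = clip_style_loss(visual, visual)          # perfectly matched pairs
shuffled = clip_style_loss(visual, visual[::-1])   # every pair mismatched
```

Under these assumptions the loss is small when paired embeddings agree and large when the pairing is scrambled, which is the signal that pulls the two modalities into a shared embedding space.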
Why it matters
It provides an open-source hardware design and scalable dataset that accelerates research in visuo-tactile robot learning and enables more robust, contact-aware manipulation policies.
Abstract
Enabling robots to perform contact-rich manipulation remains a pivotal challenge in robot learning, which is substantially hindered by the data collection gap, including its inefficiency and limited sensor setup. While prior work has explored handheld paradigms, their rod-based mechanical structures remain rigid and unintuitive, providing limited tactile feedback and posing challenges for operators. Motivated by the dexterity and force feedback of human motion, we propose FreeTacMan, a human-centric and robot-free data collection system for accurate and efficient robot manipulation. Concretely, we design a wearable gripper with visuo-tactile sensors for data collection, which can be worn on human fingers for intuitive control. A high-precision optical tracking system is introduced to capture end-effector poses while synchronizing visual and tactile feedback. We leverage FreeTacMan to collect a large-scale multimodal dataset, comprising over 3000k paired visuo-tactile images with end-effector poses and 10k demonstration trajectories across 50 diverse contact-rich manipulation tasks. FreeTacMan achieves multiple improvements in data collection performance over prior works and enables effective policy learning from self-collected datasets. By open-sourcing the hardware and the dataset, we aim to facilitate reproducibility and support research in visuo-tactile manipulation.
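The abstract states that poses, visual frames, and tactile frames are synchronized but does not say how. One common way to align streams recorded at different rates is nearest-timestamp matching, sketched below. This is an assumption for illustration and not the authors' pipeline. The function name `nearest_indices` and the example rates (a roughly 30 Hz camera and a 100 Hz pose stream) are hypothetical.

```python
import numpy as np


def nearest_indices(ref_ts, other_ts):
    """For each reference timestamp, return the index of the closest
    timestamp in `other_ts`.

    Assumes `other_ts` is sorted ascending and has at least two entries.
    Hypothetical helper for illustration only.
    """
    ref_ts = np.asarray(ref_ts, dtype=float)
    other_ts = np.asarray(other_ts, dtype=float)

    # Index of the first sample in `other_ts` that is >= each reference time.
    right = np.searchsorted(other_ts, ref_ts)
    # Keep both neighbours inside the array bounds.
    right = np.clip(right, 1, len(other_ts) - 1)
    left = right - 1

    # Pick whichever neighbour is closer in time (ties go to the earlier one).
    choose_left = (ref_ts - other_ts[left]) <= (other_ts[right] - ref_ts)
    return np.where(choose_left, left, right)


# Hypothetical timestamps in seconds: camera frames and motion-capture poses.
camera_ts = [0.000, 0.033, 0.066]
pose_ts = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]

matched = nearest_indices(camera_ts, pose_ts)
```

Each entry of `matched` indexes the pose sample to pair with the corresponding camera frame. The same call could be repeated against the tactile stream to build synchronized triples. A real pipeline would likely also reject matches whose time gap exceeds a tolerance.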