ICRA 2026

TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System

Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa, Yan Duan, Pieter Abbeel, Guanya Shi, Jiajun Wu, Karen Liu

AI summary

TWIST2 enables scalable, portable whole-body humanoid teleoperation and data collection using a low-cost VR setup and a custom 2-DoF robot neck, and the resulting data supports learning autonomous visuomotor policies.
Humanoid robotics, Teleoperation, Data collection, Visuomotor policy, Egocentric vision, Portable mocap

Problem

Existing humanoid teleoperation systems either lack full whole-body control or depend on expensive, non-portable motion capture setups, limiting scalable data collection for humanoid robots.

Approach

TWIST2 uses a low-cost PICO4U VR headset with ankle trackers to capture whole-body human motion without a dedicated motion capture rig, and pairs it with a custom 2-DoF active robot neck for egocentric vision. Together these enable holistic human-to-humanoid retargeting, on top of which the authors build a hierarchical visuomotor policy framework.
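To make the idea of a hierarchical visuomotor policy more concrete, here is a minimal Python sketch of a generic two-level control loop. It is not the authors' implementation: the summary does not specify the interface between the levels, so this assumes a high-level policy that maps an egocentric image to a whole-body command and a low-level controller that tracks that command at a faster rate. Every name here (`run_hierarchical_policy`, `high_level`, `low_level`, `low_level_steps_per_command`) is hypothetical, and the toy stand-in functions exist only so the loop can run.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def run_hierarchical_policy(
    high_level: Callable[[Sequence[float]], Vector],
    low_level: Callable[[Vector, Vector], Vector],
    get_image: Callable[[], Sequence[float]],
    get_state: Callable[[], Vector],
    steps: int,
    low_level_steps_per_command: int = 5,  # assumed rate ratio, not from the paper
) -> List[Vector]:
    """Run a generic two-level loop and return the low-level actions."""
    actions: List[Vector] = []
    command: Vector = []
    for t in range(steps):
        # The high-level policy is queried only every few low-level steps.
        if t % low_level_steps_per_command == 0:
            command = high_level(get_image())
        # The low-level controller runs every step and tracks the latest command.
        actions.append(low_level(command, get_state()))
    return actions


# Toy stand-ins (hypothetical): a "policy" that commands the mean pixel value
# on two joints, and a proportional controller that steps toward the command.
high_level_calls = [0]


def toy_high_level(image: Sequence[float]) -> Vector:
    high_level_calls[0] += 1
    mean = sum(image) / len(image)
    return [mean, mean]


def toy_low_level(command: Vector, state: Vector) -> Vector:
    return [0.5 * (c - s) for c, s in zip(command, state)]


actions = run_hierarchical_policy(
    high_level=toy_high_level,
    low_level=toy_low_level,
    get_image=lambda: [1.0, 3.0],
    get_state=lambda: [0.0, 0.0],
    steps=10,
)
print(len(actions), high_level_calls[0], actions[0])
```

In this sketch the high-level policy is called twice over ten steps while the low-level controller is called ten times. That difference in call rate is the only structural property the example is meant to illustrate.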

Key results

  • Achieves full whole-body control with a portable ~$1000 VR setup and a $250 add-on neck
  • Captures ~100 successful demonstrations in 15-20 minutes with near 100% success rate
  • Trains a hierarchical visuomotor policy for autonomous full-body control using egocentric vision
  • Demonstrates long-horizon dexterous manipulation and dynamic legged tasks like towel folding and kicking
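As a back-of-the-envelope reading of the figures above, the following Python sketch converts the reported collection rate into average time per demonstration and sums the quoted hardware costs. The function names are illustrative, and the inputs are the approximate values stated in this summary rather than measured data.

```python
def seconds_per_demo(num_demos: int, minutes: float) -> float:
    """Average wall-clock seconds spent per demonstration."""
    return minutes * 60.0 / num_demos


def total_hardware_cost(vr_setup_usd: float, neck_usd: float) -> float:
    """Combined cost of the VR setup and the add-on neck, in USD."""
    return vr_setup_usd + neck_usd


# Approximate values quoted in the key results.
fastest = seconds_per_demo(100, 15)     # lower bound of the 15-20 minute range
slowest = seconds_per_demo(100, 20)     # upper bound of the 15-20 minute range
cost = total_hardware_cost(1000, 250)   # ~$1000 VR setup plus $250 neck

print(f"{fastest:.0f}-{slowest:.0f} s per demonstration, about ${cost:.0f} of hardware")
```

Under these numbers, each demonstration takes roughly 9 to 12 seconds on average, and the teleoperation hardware (excluding the robot itself) comes to about $1250.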

Why it matters

It provides a reproducible, low-cost framework for scalable humanoid data collection and autonomous whole-body control, accelerating progress in humanoid robotics.

Abstract

Large-scale data has driven breakthroughs in robotics, from language models to vision-language-action models in bimanual manipulation. However, humanoid robotics lacks equally effective data collection frameworks. Existing humanoid teleoperation systems either use decoupled control or depend on expensive motion capture setups. We introduce TWIST2, a portable, mocap-free humanoid teleoperation and data collection system that preserves full whole-body control while advancing scalability. Our system leverages PICO4U VR for obtaining real-time whole-body human motions, with a custom 2-DoF robot neck (cost around $250) for egocentric vision, enabling holistic human-to-humanoid control. We demonstrate long-horizon dexterous and mobile humanoid skills and we can collect 100 demonstrations in 15 minutes with an almost 100% success rate. Building on this pipeline, we propose a hierarchical visuomotor policy framework that autonomously controls the full humanoid body based on egocentric vision. Our visuomotor policy successfully demonstrates whole-body dexterous manipulation and dynamic kicking tasks. The entire system is fully reproducible and open-sourced at https://yanjieze.com/TWIST2. Our collected dataset is also open-sourced at https://twist-data.github.io.

Index terms

Humanoid Robot Systems; Imitation Learning; Big Data in Robotics and Automation
