OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction
Lujie Yang, Xiaoyu Huang, Zhen Wu, Angjoo Kanazawa, Pieter Abbeel, Carmelo Sferrazza, Karen Liu, Yan Duan, Guanya Shi
AI summary
Problem
Existing motion retargeting methods for humanoids struggle with the embodiment gap, producing physically implausible artifacts and neglecting crucial human-object and terrain interactions, which complicates reinforcement learning policy training.
Approach
The framework uses an interaction mesh to explicitly preserve spatial and contact relationships between the robot, objects, and terrain, then applies constrained optimization to generate kinematically feasible trajectories that can be systematically augmented from a single demonstration.
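One common way to make "preserving spatial relationships" concrete is a Laplacian deformation energy over a shared mesh. The minimal sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes uniform neighbor weights, a hand-written neighbor list, and hypothetical function names (`laplacian_coords`, `deformation_energy`). It also omits the kinematic constraints and the robot-side optimization entirely.

```python
# Illustrative sketch only: uniform-weight Laplacian coordinates on a tiny mesh.
# The neighbor list, weights, and function names are assumptions for this example.

def laplacian_coords(points, neighbors):
    """Return each vertex minus the mean of its neighbors (uniform weights)."""
    coords = []
    for p, nbrs in zip(points, neighbors):
        mean = [sum(points[j][k] for j in nbrs) / len(nbrs) for k in range(3)]
        coords.append([p[k] - mean[k] for k in range(3)])
    return coords


def deformation_energy(source, target, neighbors):
    """Sum of squared differences between source and target Laplacian coordinates."""
    ls = laplacian_coords(source, neighbors)
    lt = laplacian_coords(target, neighbors)
    return sum((a - b) ** 2 for u, v in zip(ls, lt) for a, b in zip(u, v))


# Four hypothetical keypoints (e.g. body, object, and terrain samples), fully connected.
source = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]

# A rigid translation keeps every vertex's position relative to its neighbors.
shifted = [[x + 0.5, y - 0.2, z + 1.0] for x, y, z in source]

# Moving a single keypoint changes its relationship to the rest of the mesh.
moved = [list(p) for p in source]
moved[0][0] += 0.3

energy_shifted = deformation_energy(source, shifted, neighbors)  # ~0 up to rounding
energy_moved = deformation_energy(source, moved, neighbors)      # strictly positive
```

Because the energy depends only on each vertex's position relative to its neighbors, it penalizes changes in relative layout (for example, a hand drifting away from an object) rather than changes in absolute position. A retargeting optimizer of the kind described above would minimize such a term over robot configurations, subject to its feasibility constraints.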
Key results
- Generates over 8 hours of interaction-preserving trajectories with better kinematic constraint satisfaction and contact preservation than widely used baselines.
- Enables zero-shot sim-to-real transfer of long-horizon (up to 30s) parkour and loco-manipulation skills on a Unitree G1 humanoid.
- Achieves successful policy training using only five reward terms and four domain randomization parameters shared across all tasks.
- Systematically augments single human demonstrations into diverse scenarios across different robot embodiments, object shapes, and terrains.
Why it matters
It eases the data bottleneck and reduces the reward engineering required for humanoid control, enabling scalable and expressive whole-body skills to be learned and deployed on physical robots.
Abstract
A dominant paradigm for teaching humanoid robots complex skills is to retarget human motions as kinematic references to train reinforcement learning (RL) policies. However, existing retargeting pipelines often struggle with the significant embodiment gap between humans and robots, producing physically implausible artifacts like foot-skating and penetration. More importantly, common retargeting methods neglect the rich human-object and human-environment interactions essential for expressive locomotion and loco-manipulation. To address this, we introduce OMNIRETARGET, an interaction-preserving data generation engine based on an interaction mesh that explicitly models and preserves the crucial spatial and contact relationships between an agent, the terrain, and manipulated objects. By minimizing the Laplacian deformation between the human and robot meshes while enforcing kinematic constraints, OMNIRETARGET generates kinematically feasible trajectories. Moreover, preserving task-relevant interactions enables efficient data augmentation, from a single demonstration to different robot embodiments, terrains, and object configurations. We comprehensively evaluate OMNIRETARGET by retargeting motions from OMOMO [1], LAFAN1 [2], and our in-house MoCap datasets, generating over 8 hours of trajectories that achieve better kinematic constraint satisfaction and contact preservation than widely used baselines. Such high-quality data enables proprioceptive RL policies to successfully execute long-horizon (up to 30 seconds) parkour and loco-manipulation skills on a Unitree G1 humanoid, trained with only 5 reward terms and simple domain randomization shared by all tasks, without any learning curriculum. All code, retargeted datasets, and result videos can be found at https://omniretarget.github.io.
1 Amazon FAR (Frontier AI & Robotics), 2 MIT, 3 UC Berkeley, 4 Stanford University, 5 CMU. * Equal contribution, work done while interning at Amazon FAR. † Amazon FAR team co-lead.