ICRA 2026

H-Zero: Cross-Humanoid Locomotion Pretraining Enables Few-Shot Novel Embodiment Transfer

Yunfeng Lin, Minghuan Liu, Yufei Xue, Ming Zhou, Yong Yu, Jiangmiao Pang, Weinan Zhang

AI summary

Pretraining a locomotion policy on multiple diverse humanoid robots enables rapid few-shot adaptation to novel, unseen robots with minimal fine-tuning time.

Cross-embodiment learning, Locomotion pretraining, Few-shot transfer, Humanoid robotics, Reinforcement learning, Sim-to-real transfer

Problem

Existing RL-based locomotion controllers are tightly coupled to specific robot morphologies, requiring extensive, resource-intensive retraining and tuning for each new hardware design.

Approach

H-Zero introduces a cross-embodiment pretraining pipeline that uses unified control semantics, embodiment descriptors, and extended domain randomization across a mixed set of real robot models to learn a generalizable base policy.
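One way to picture "unified control semantics" together with an embodiment descriptor is a fixed joint layout shared by all robots, plus a mask that marks which slots a given robot actually has. The sketch below is illustrative only: the slot names in `UNIFIED_JOINTS`, the helpers `to_unified`, `from_unified`, and `build_observation`, and the contents of the descriptor are all assumptions, not the interface used by H-Zero.

```python
from typing import Dict, List, Tuple

# Hypothetical canonical joint layout (assumption; the paper's slots are not listed here).
UNIFIED_JOINTS: List[str] = [
    "left_hip_pitch", "left_knee", "left_ankle_pitch",
    "right_hip_pitch", "right_knee", "right_ankle_pitch",
    "waist_yaw",
]


def to_unified(joint_values: Dict[str, float],
               name_map: Dict[str, str]) -> Tuple[List[float], List[float]]:
    """Scatter robot-specific joint readings into the shared layout.

    Returns (values, mask). mask[i] is 1.0 where the robot has that joint
    and 0.0 where the slot is zero-padded.
    """
    values = [0.0] * len(UNIFIED_JOINTS)
    mask = [0.0] * len(UNIFIED_JOINTS)
    for robot_name, reading in joint_values.items():
        idx = UNIFIED_JOINTS.index(name_map[robot_name])
        values[idx] = reading
        mask[idx] = 1.0
    return values, mask


def from_unified(actions: List[float],
                 name_map: Dict[str, str]) -> Dict[str, float]:
    """Gather a unified action vector back into the robot's own joint names."""
    return {robot_name: actions[UNIFIED_JOINTS.index(unified_name)]
            for robot_name, unified_name in name_map.items()}


def build_observation(values: List[float], mask: List[float],
                      descriptor: List[float]) -> List[float]:
    """Concatenate joint values, the joint mask, and an embodiment descriptor."""
    return values + mask + descriptor


# Usage with a made-up robot that has no waist joint.
name_map = {"L_HIP_P": "left_hip_pitch", "L_KNEE": "left_knee"}
values, mask = to_unified({"L_HIP_P": 0.1, "L_KNEE": -0.3}, name_map)
# Hypothetical descriptor: [total mass in kg, leg length in m].
obs = build_observation(values, mask, [35.0, 0.6])
```

With this kind of layout, a single policy network sees the same input and output dimensions for every robot, and the mask plus descriptor tell it which embodiment it is driving.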

Key results

Unified control interface standardizes joint semantics across diverse morphologies
Pretrained policy maintains up to 81% of the full episode duration on unseen robots in simulation, without fine-tuning
Few-shot fine-tuning achieves stable locomotion on unseen humanoids and upright quadrupeds within 30 minutes
Demonstrates consistent sim-to-real transfer across multiple physical platforms
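The episode-duration figure can be read as the average fraction of the episode the robot survives before termination. The exact definition is not given in this summary, so the sketch below assumes it is the mean number of steps survived divided by the maximum episode length; the function name `episode_duration_ratio` and the sample numbers are illustrative, not results from the paper.

```python
from typing import List


def episode_duration_ratio(steps_survived: List[int], max_steps: int) -> float:
    """Mean fraction of the full episode completed before termination.

    Assumed definition: each rollout contributes min(steps, max_steps) / max_steps.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    if not steps_survived:
        raise ValueError("need at least one rollout")
    clipped = [min(s, max_steps) for s in steps_survived]
    return sum(clipped) / (len(clipped) * max_steps)


# Made-up rollouts: one full episode, two early falls.
ratio = episode_duration_ratio([1000, 500, 250], max_steps=1000)
```

Under this reading, a ratio of 1.0 means every rollout ran to the time limit, and lower values indicate earlier falls on average.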

Why it matters

Enables scalable, hardware-agnostic controller development for the rapidly growing ecosystem of customized humanoid robots, drastically reducing deployment time and computational costs.

Abstract

The rapid advancement of humanoid robotics has intensified the need for robust and adaptable controllers to enable stable and efficient locomotion across diverse platforms. However, developing such controllers remains a significant challenge because existing solutions are tailored to specific robot designs, requiring extensive tuning of reward functions, physical parameters, and training hyperparameters for each embodiment. To address this challenge, we introduce H-Zero, a cross-humanoid locomotion pretraining pipeline that learns a generalizable humanoid base policy. We show that pretraining on a limited set of embodiments enables zero-shot and few-shot transfer to novel humanoid robots with minimal fine-tuning. Evaluations show that the pretrained policy maintains up to 81% of the full episode duration on unseen robots in simulation while enabling few-shot transfer to unseen humanoids and upright quadrupeds within 30 minutes of fine-tuning.

Index terms

Humanoid and Bipedal Locomotion; Transfer Learning; Reinforcement Learning