ICRA 2026

RPG: Robust Policy Gating for Smooth Multi-Skill Transitions in Humanoid Fighting

Xuelong Li, Junbo Tan, Dong Wang

AI summary

RPG enables humanoid robots to switch seamlessly and stably between diverse combat skills in real time, validated through simulation and real-world deployment.

Humanoid robots; imitation learning; policy switching; skill transitions; robust control; sim-to-real deployment

Problem

Existing imitation learning approaches for humanoid fighting suffer from instability and jerky motions when switching between skills due to mismatched state distributions and out-of-domain disturbances during transitions.

Approach

The framework trains separate expert policies for each fighting skill and applies policy-transition and temporal randomization during training to force robustness against abrupt switches. A lightweight gating network then blends these experts with smoothness regularization to produce fluid, stable multi-skill control.
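The blending step described above can be illustrated with a minimal sketch. It assumes the gate outputs one logit per expert, that the logits are normalized with a softmax, and that smoothness is encouraged by penalizing the squared change in gate weights between consecutive control steps. None of these details are stated in the summary; the names `blend_actions` and `smoothness_penalty` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_actions(gate_logits, expert_actions):
    """Return (weights, blended_action).

    gate_logits: one score per expert (assumed gate output).
    expert_actions: one action vector per expert, all the same length.
    The blended action is the convex combination of the expert actions.
    """
    weights = softmax(gate_logits)
    dim = len(expert_actions[0])
    blended = [
        sum(w * action[j] for w, action in zip(weights, expert_actions))
        for j in range(dim)
    ]
    return weights, blended


def smoothness_penalty(prev_weights, weights):
    """Squared change in gate weights between two consecutive steps.

    This is an assumed form of the regularizer, chosen for illustration.
    """
    return sum((w - p) ** 2 for w, p in zip(weights, prev_weights))
```

With equal logits the two experts contribute equally, and an instantaneous hard switch from one expert to the other incurs the largest penalty under this assumed regularizer, which is what discourages abrupt changes in the blend.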

Key results

Policy-transition and temporal randomization improve robustness to abrupt skill switches
Lightweight gating network enables smooth, real-time fusion of multiple expert policies
Integrated locomotion and combat pipeline supports prolonged, game-like humanoid fighting
Successful sim-to-real transfer on the Unitree G1 validates robust real-world execution

Why it matters

Advances practical whole-body control for humanoids, enabling reliable deployment of complex, dynamic multi-skill behaviors in interactive and real-world applications.

Abstract

Humanoid robots have demonstrated impressive motor skills in a wide range of tasks, yet whole-body control for humanlike, long-duration, dynamic fighting remains particularly challenging due to the stringent requirements on agility and stability. While imitation learning enables robots to execute humanlike fighting skills, existing approaches often rely on switching among multiple single-skill policies or on a general policy that imitates input reference motions. These strategies suffer from instability when transitioning between skills, because the mismatch between the initial and terminal states of different skills or reference motions introduces out-of-domain disturbances, resulting in jerky or unstable behaviors. In this work, we propose RPG, a hybrid expert policy framework for smooth and stable humanoid multi-skill transitions. Our approach incorporates motion transition randomization and temporal randomization to train a unified policy that generates agile fighting actions while remaining stable and smooth during skill transitions. Furthermore, we design a control pipeline that integrates walking/running locomotion with fighting skills, allowing humanlike combat of arbitrary duration in which the active action policy can be seamlessly interrupted or switched at any time. Extensive experiments in simulation demonstrate the effectiveness of the proposed framework, and real-world deployment on the Unitree G1 humanoid robot further validates its robustness and applicability.
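One way to picture transition and temporal randomization is as a randomized switch schedule sampled for each training episode. The sketch below is an illustrative assumption, not the paper's procedure: it supposes that randomization means drawing both the next skill and the number of steps it is held at random. The function `sample_switch_schedule` and its parameters `min_hold` and `max_hold` are hypothetical.

```python
import random


def sample_switch_schedule(num_skills, episode_len, min_hold, max_hold, rng):
    """Sample a list of (start_step, skill_id) pairs for one episode.

    Each skill is held for a random number of steps in
    [min_hold, max_hold] (assumed temporal randomization), then replaced
    by a different, randomly chosen skill (assumed transition
    randomization).
    """
    schedule = []
    step = 0
    skill = rng.randrange(num_skills)
    while step < episode_len:
        schedule.append((step, skill))
        step += rng.randint(min_hold, max_hold)
        if num_skills > 1:
            # A non-zero offset modulo num_skills always yields a new skill.
            skill = (skill + rng.randrange(1, num_skills)) % num_skills
    return schedule


if __name__ == "__main__":
    rng = random.Random(0)
    for start, skill in sample_switch_schedule(4, 200, 10, 50, rng):
        print(f"step {start:3d}: switch to skill {skill}")
```

Under these assumptions, training against many such schedules would expose each skill to start states left behind by every other skill at arbitrary moments, which is the kind of mismatch the abstract identifies as the source of instability.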

Index terms

Human and Humanoid Motion Analysis and Synthesis; Imitation Learning; Legged Robots