CMoE: Contrastive Mixture of Experts for Motion Control and Terrain Adaptation of Humanoid Robots
Shihao Ma, Hongjin Chen, Zijun Xu, Yi Zhao, Ke Wu, Ruichen Yang, Leyao Zou, Zhongxue Gan, Wenchao Ding
AI summary
Problem
Vanilla Mixture of Experts models suffer from 'lazy gating,' where expert activations remain nearly uniform across different terrains, preventing effective terrain specialization and limiting adaptability in complex, heterogeneous environments.
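One way to make "lazy gating" concrete is to measure how close each terrain's gate distribution is to uniform, and how little it changes between terrains. The sketch below is an illustration only and is not taken from the paper: the function names, the four-expert setup, and the logit values are all hypothetical. It uses normalized entropy (1.0 means perfectly uniform) and total variation distance between two terrains' gate distributions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Entropy divided by log(num_experts); 1.0 means uniform gating."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def total_variation(p, q):
    """Total variation distance between two gate distributions (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical gate logits for two terrains (values invented for illustration).
lazy_stairs = softmax([0.10, 0.00, 0.05, 0.02])
lazy_gap = softmax([0.00, 0.08, 0.03, 0.06])
spec_stairs = softmax([4.0, 0.0, 0.0, 0.0])
spec_gap = softmax([0.0, 0.0, 4.0, 0.0])

for name, a, b in [("lazy", lazy_stairs, lazy_gap),
                   ("specialized", spec_stairs, spec_gap)]:
    print(name,
          "entropy:", round(normalized_entropy(a), 3),
          "cross-terrain TV:", round(total_variation(a, b), 3))
```

Under these assumed values, the lazy gate has entropy close to 1 and almost no difference between terrains, while the specialized gate has low entropy and a large cross-terrain distance.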
Approach
The authors propose CMoE, a single-stage reinforcement learning framework that combines a Mixture of Experts policy with a contrastive learning objective to align expert activation distributions with terrain-specific features, encouraging clear specialization.
Key results
- Achieves state-of-the-art success rates and travel distances across 8 diverse terrains in simulation.
- Enables traversal of 20 cm continuous steps and 80 cm gaps on a physical Unitree G1 robot.
- Establishes clear, terrain-specific clustering of expert activations via contrastive learning.
- Commits to a public code release to support further research and community development.
Why it matters
It provides a scalable, single-stage training paradigm that significantly improves real-world humanoid locomotion on complex, heterogeneous terrains, benefiting robotics researchers and developers.
Abstract
For effective deployment in real-world environments, humanoid robots must autonomously navigate a diverse range of complex terrains with abrupt transitions. While the vanilla mixture of experts (MoE) framework is theoretically capable of modeling diverse terrain features, in practice, the gating network exhibits nearly uniform expert activations across different terrains, weakening the expert specialization and limiting the model's expressive power. To address this limitation, we introduce CMoE, a novel single-stage reinforcement learning framework that integrates contrastive learning to refine expert activation distributions. By imposing contrastive constraints, CMoE maximizes the consistency of expert activations within the same terrain while minimizing their similarity across different terrains, thereby encouraging experts to specialize in distinct terrain types. We validated our approach on the Unitree G1 humanoid robot through a series of challenging experiments. Results demonstrate that CMoE enables the robot to traverse continuous steps up to 20 cm high and gaps up to 80 cm wide, while achieving robust and natural gait across diverse mixed terrains, surpassing the limits of existing methods. To support further research and foster community development, we will release our code publicly.
1 College of Intelligent Robotics and Advanced Manufacturing, Fudan University, Shanghai, China, 200433
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 62403142, in part by the Science and Technology Commission of Shanghai Municipality under Grant 24511103100, and in part by the Shanghai Municipal Science and Technology Major Project (No. 2021SHZDZX0103).
∗ Corresponding authors: Wenchao Ding and Zhongxue Gan.
Project Page: https://hoshi-no-ai.github.io/CMoE
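The contrastive constraint described in the abstract (similar expert activations within a terrain, dissimilar activations across terrains) can be sketched with a supervised-contrastive-style loss over gate activation vectors. This is a minimal sketch under stated assumptions, not the authors' formulation: the abstract does not give the loss, so the cosine similarity, the `temperature` parameter, the function name `terrain_contrastive_loss`, and the sample values are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; assumes non-zero vectors (gate outputs sum to 1)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def terrain_contrastive_loss(gates, terrains, temperature=0.1):
    """Assumed SupCon-style loss: samples sharing a terrain label are positives."""
    n = len(gates)
    total, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and terrains[j] == terrains[i]]
        if not positives:
            continue  # anchor has no same-terrain partner in this batch
        sims = {j: cosine(gates[i], gates[j]) / temperature
                for j in range(n) if j != i}
        # log-sum-exp over all other samples, computed stably
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0

# Hypothetical gate activations over three experts for four samples.
terrains = ["stairs", "stairs", "gap", "gap"]
specialized = [[0.90, 0.05, 0.05], [0.85, 0.10, 0.05],
               [0.05, 0.90, 0.05], [0.10, 0.85, 0.05]]
lazy = [[0.34, 0.33, 0.33], [0.33, 0.34, 0.33],
        [0.33, 0.33, 0.34], [0.34, 0.33, 0.33]]

loss_specialized = terrain_contrastive_loss(specialized, terrains)
loss_lazy = terrain_contrastive_loss(lazy, terrains)
print("specialized:", loss_specialized, "lazy:", loss_lazy)
```

With these assumed inputs, the near-uniform activations give a higher loss than the terrain-clustered ones, so minimizing a term of this kind alongside the reinforcement learning objective would push the gate toward terrain-specific activation patterns.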