ICRA 2026

Agile Hauler Curriculum: Learning High-Speed Locomotion for Robots under Demanding Payloads

Yawen Zhou, Haopeng Tang, Huiqiao Fu, Peng Li

AI summary

A novel Elo-based curriculum enables legged robots to run faster and more efficiently under heavy payloads by dynamically focusing training on the performance frontier.
Legged robots · Curriculum learning · Payload locomotion · Reinforcement learning · Sim-to-real transfer · Energy efficiency

Problem

Legged robots struggle to combine agility, speed, and energy efficiency while carrying substantial payloads. Existing curriculum learning methods either keep spending training time on tasks the agent has already mastered or fail to adapt to changing load demands.

Approach

The Agile Hauler Curriculum (AHC) frames training as a multiplayer game using an Elo rating system to continuously sample challenging velocity tasks at the agent's performance frontier, paired with staged payload progression and energy-aware reward gating.
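
The Elo-based sampling idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each velocity-command bin is treated as an "opponent" with its own rating, that an episode yields a binary success signal, and that tasks are weighted by how close the predicted success probability is to 0.5. The names `expected_score`, `elo_update`, `frontier_weights`, `sample_task`, the constant `K = 32`, and the 400-point scale are standard Elo conventions or hypothetical choices, not details taken from the paper. The payload axis and the energy-aware gating are omitted.

```python
import random

K = 32.0  # Elo step size; an assumed value, not from the paper


def expected_score(agent_rating, task_rating):
    """Standard Elo prediction: probability that the agent 'beats' the task."""
    return 1.0 / (1.0 + 10.0 ** ((task_rating - agent_rating) / 400.0))


def elo_update(agent_rating, task_rating, success, k=K):
    """Zero-sum update: the agent gains what the task loses, and vice versa."""
    e = expected_score(agent_rating, task_rating)
    s = 1.0 if success else 0.0
    delta = k * (s - e)
    return agent_rating + delta, task_rating - delta


def frontier_weights(agent_rating, task_ratings):
    """Assumed frontier heuristic: weight e * (1 - e) peaks where e = 0.5,
    i.e. on tasks the agent succeeds at about half the time."""
    raw = []
    for r in task_ratings:
        e = expected_score(agent_rating, r)
        raw.append(e * (1.0 - e))
    total = sum(raw)
    return [w / total for w in raw]


def sample_task(agent_rating, task_ratings, rng=random):
    """Draw the index of the next velocity bin to train on."""
    weights = frontier_weights(agent_rating, task_ratings)
    return rng.choices(range(len(task_ratings)), weights=weights, k=1)[0]
```

Under these assumptions, mastered low-speed bins drift to low ratings and bins that are still far too hard keep high ratings, so both receive little sampling weight while bins rated near the agent dominate.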

Key results

  • 2.42 m/s max speed with 12 kg payload on real-world Go2 robot
  • 46.7% higher speed and 20.5% lower average energy consumption than a standard grid adaptive curriculum
  • Successful zero-shot sim-to-real transfer under varying heavy loads
  • Dynamic task sampling avoids redundant training on mastered low-speed commands
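
As a rough sanity check on the reported percentages, the short calculation below backs out what they would imply about the baseline. It assumes the 46.7% figure is measured against the baseline's maximum speed under comparable payload conditions. The page does not state the baseline numbers, so the derived values are inferences, not reported results.

```python
# Figures reported on this page
ahc_speed = 2.42      # m/s, AHC max speed with a 12 kg payload
speed_gain = 0.467    # 46.7% increase over the baseline
energy_cut = 0.205    # 20.5% average reduction in energy consumption

# Derived under the stated assumption; not reported in the source
baseline_speed = ahc_speed / (1.0 + speed_gain)
energy_ratio = 1.0 - energy_cut  # AHC energy as a fraction of the baseline's

print(f"implied baseline speed: {baseline_speed:.2f} m/s")
print(f"AHC energy relative to baseline: {energy_ratio:.3f}")
```

On this reading the baseline would top out at roughly 1.65 m/s, and the AHC policy would use about 79.5% of the baseline's energy.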

Why it matters

Provides a scalable, model-free framework for deploying agile, energy-efficient legged robots in high-demand logistics and disaster response applications.

Abstract

Dynamic control for legged robots confronts a fundamental trilemma, where concurrent demands for high agility, substantial payload capacity and energy efficiency impose deeply coupled and often conflicting constraints. We introduce the Agile Hauler Curriculum (AHC), a learning-based method that bypasses complex mathematical modeling to address this problem. The core of AHC is an Elo-based dual-axis dynamic sampling curriculum that continuously focuses training on the agent’s performance frontier, systematically pushing the robot’s agility-payload performance envelope while an energy-aware gating mechanism ensures efficiency. In real-world deployment on the Go2 robot, the AHC-trained policy achieved a max speed of 2.42 m/s with a 12 kg payload, representing a 46.7% increase in speed and a 20.5% average reduction in energy consumption compared to standard grid adaptive curriculum.

Index terms

Legged Robots · Motion Control · Reinforcement Learning
