ICRA 2026

Adaptive Legged Locomotion Via Online Learning for Model Predictive Control

Hongyu Zhou, Xiaoyu Zhang, Vasileios Tzoumas

AI summary

An online learning and model predictive control framework enables quadrupeds to adapt to unknown dynamics and disturbances in real time, improving trajectory tracking accuracy without offline training.

Legged control, online learning, model predictive control, adaptive control, random Fourier features, dynamic regret

Problem

Legged robots struggle to maintain accurate trajectory tracking under real-world uncertainties like unknown payloads and uneven terrain, while existing control methods either require costly offline training or make overly conservative worst-case assumptions.

Approach

The method couples model predictive control with online least-squares estimation over random Fourier features, continuously learning and compensating for unknown dynamics and disturbances in real time.
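The learning half of this approach can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the class name `RFFResidualModel`, the Gaussian-kernel choice for the random features, the regularization value, and the toy residual function are all assumptions made for the example, and the MPC solver itself is omitted. In a control loop, the residual target would presumably be the gap between the measured next state and the nominal model's prediction.

```python
import numpy as np


class RFFResidualModel:
    """Hypothetical residual model h(z) ~= theta.T @ phi(z) using random Fourier features."""

    def __init__(self, in_dim, out_dim, n_features=200, lengthscale=1.0,
                 reg=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        # Frequencies sampled from the spectral density of a Gaussian kernel
        # (an assumed kernel choice for this sketch).
        self.W = rng.normal(0.0, 1.0 / lengthscale, size=(n_features, in_dim))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        self.scale = np.sqrt(2.0 / n_features)
        self.theta = np.zeros((n_features, out_dim))
        # P tracks the inverse of (reg * I + sum of phi phi^T).
        self.P = np.eye(n_features) / reg

    def features(self, z):
        return self.scale * np.cos(self.W @ z + self.b)

    def predict(self, z):
        return self.theta.T @ self.features(z)

    def update(self, z, residual):
        """One recursive least-squares step on the pair (z, residual)."""
        phi = self.features(z)
        P_phi = self.P @ phi
        gain = P_phi / (1.0 + phi @ P_phi)
        error = residual - self.theta.T @ phi
        self.theta += np.outer(gain, error)
        self.P -= np.outer(gain, P_phi)


def demo(n_steps=500, seed=1):
    """Stream samples of a toy residual and compare learned vs. zero-residual error."""
    rng = np.random.default_rng(seed)

    def true_residual(z):
        # Toy stand-in for unmodeled dynamics; not taken from the paper.
        return np.array([np.sin(2.0 * z[0]) + 0.5 * z[1], z[0] * z[1]])

    model = RFFResidualModel(in_dim=2, out_dim=2)
    for _ in range(n_steps):
        z = rng.uniform(-1.0, 1.0, size=2)
        model.update(z, true_residual(z))

    z_test = rng.uniform(-1.0, 1.0, size=(200, 2))
    err_learned = np.mean(
        [np.linalg.norm(model.predict(z) - true_residual(z)) for z in z_test])
    err_nominal = np.mean([np.linalg.norm(true_residual(z)) for z in z_test])
    return err_learned, err_nominal


if __name__ == "__main__":
    learned, nominal = demo()
    print(f"mean error with learned residual: {learned:.4f}")
    print(f"mean error with zero residual:    {nominal:.4f}")
```

Because the model is linear in `theta`, each update is a closed-form recursive least-squares step whose cost depends on the number of features rather than on the amount of data seen, which is what makes this kind of estimator plausible inside a real-time loop.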

Key results

Sublinear dynamic regret guarantee against an optimal clairvoyant controller
Up to 67% tracking improvement over nominal MPC and 21% over L1-MPC
Robust performance under unknown payloads (8 kg), time-varying friction, and rough/sloped terrain
Real-time online learning of residual dynamics via random Fourier features

Why it matters

It enables reliable, adaptive legged robot operation in unstructured environments without costly offline training, directly benefiting search-and-rescue, logistics, and industrial inspection applications.

Abstract

We provide an algorithm for adaptive legged locomotion via online learning and model predictive control. The algorithm is composed of two interacting modules: model predictive control (MPC) and online learning of residual dynamics. The residual dynamics can represent modeling errors and external disturbances. We are motivated by the future of autonomy where quadrupeds will autonomously perform complex tasks despite real-world unknown uncertainty, such as unknown payload and uneven terrains. The algorithm uses random Fourier features to approximate the residual dynamics in reproducing kernel Hilbert spaces. Then, it employs MPC based on the current learned model of the residual dynamics. The model is updated online in a self-supervised manner using least squares based on the data collected while controlling the quadruped. The algorithm enjoys sublinear dynamic regret, defined as the suboptimality against an optimal clairvoyant controller that knows the residual dynamics. We validate our algorithm in Gazebo and MuJoCo simulations, where the quadruped aims to track reference trajectories. The Gazebo simulations include constant unknown external forces up to 12g, where g is the gravity vector, in flat terrain, sloped terrain with 20° inclination, and rough terrain with 0.25 m height variation. The MuJoCo simulations include time-varying unknown disturbances with payload up to 8 kg and time-varying ground friction coefficients in flat terrain.
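The dynamic regret mentioned in the abstract can be written out as follows. This is a common textbook-style formulation, and the notation is an assumption rather than the paper's own: c_t is the stage cost, (x_t, u_t) are the state and input under the learning controller, and (x_t^*, u_t^*) are those of the optimal clairvoyant controller over a horizon of T steps.

```latex
\mathrm{Regret}_T^{D}
  = \sum_{t=1}^{T} c_t(x_t, u_t) - \sum_{t=1}^{T} c_t(x_t^{\star}, u_t^{\star}),
\qquad
\text{sublinear:}\quad \lim_{T \to \infty} \frac{\mathrm{Regret}_T^{D}}{T} = 0 .
```

Sublinear growth means the average per-step cost gap to the clairvoyant controller vanishes as the horizon grows.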

Index terms

Legged Robots, Model Learning for Control, Robust/Adaptive Control