ICRA 2026

Safe and Efficient Quadrupedal Locomotion with a Chambolle-Pock Whole-Body Controller

Xu Yang, Run Wang, Yiwen Lu, Yilin Mo

AI summary

A hierarchical framework that combines reinforcement learning and optimal control (RL-OC), powered by a Chambolle–Pock QP solver, enables safe, energy-efficient quadrupedal locomotion with scalable parallel training and real-time deployment.

Legged locomotion, Reinforcement learning, Optimal control, Whole-body control, Chambolle–Pock, GPU acceleration

Problem

Existing quadrupedal control strategies struggle to balance the strict safety guarantees of model-based optimization with the scalable, robust policy learning of reinforcement learning, primarily due to the computational bottleneck of running optimization solvers in thousands of parallel simulation environments.

Approach

The authors propose a hierarchical architecture in which a high-level RL policy generates reference trajectories and a low-level whole-body controller (WBC) enforces hard constraints. The WBC solves a convex quadratic program (QP) with a new solver based on the primal-dual Chambolle–Pock algorithm, designed for massive GPU parallelization during training and fast CPU execution during deployment.
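To make the solver idea concrete, here is a minimal sketch of a Chambolle–Pock iteration for a generic convex QP. It is an illustration under stated assumptions, not the authors' implementation: the problem form (minimize 0.5 x'Px + q'x subject to l <= Ax <= u), the function name `chambolle_pock_qp`, the fixed step sizes, and the fixed iteration count are all choices made here, since this summary does not give the paper's formulation.

```python
import numpy as np

def chambolle_pock_qp(P, q, A, l, u, iters=500):
    """Approximately solve  min 0.5 x'Px + q'x  s.t.  l <= Ax <= u.

    Hypothetical helper for illustration; P is assumed symmetric
    positive semidefinite and A is assumed nonzero.
    """
    m, n = A.shape
    # Step sizes chosen so that tau * sigma * ||A||_2^2 < 1.
    tau = sigma = 0.9 / np.linalg.norm(A, 2)
    # The prox of the quadratic objective is a linear solve with
    # (I + tau * P); invert once so the loop only needs mat-vecs.
    M = np.linalg.inv(np.eye(n) + tau * P)
    x = np.zeros(n)
    x_bar = x.copy()
    y = np.zeros(m)
    for _ in range(iters):
        # Dual step: prox of the conjugate of the box indicator,
        # computed via the Moreau identity (a clip onto [l, u]).
        v = y + sigma * (A @ x_bar)
        y = v - sigma * np.clip(v / sigma, l, u)
        # Primal step: prox of the quadratic objective.
        x_new = M @ (x - tau * (A.T @ y) - tau * q)
        # Extrapolation with theta = 1.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x, y

# Toy usage: pull x toward (2, 2) while keeping it inside the unit box.
x_opt, y_opt = chambolle_pock_qp(
    np.eye(2), np.array([-2.0, -2.0]), np.eye(2), np.zeros(2), np.ones(2)
)
```

Each iteration uses only matrix-vector products and an elementwise clip, with no factorization or branching inside the loop. That is the property that plausibly makes this family of methods easy to run across many environments at once.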

Key results

Hierarchical RL-OC framework bridging robust policy learning with strict constraint satisfaction
Chambolle–Pock-based convex QP solver enabling massive GPU parallelization and real-time CPU deployment
Quantifiable improvements in energy efficiency, tracking accuracy, and constraint satisfaction across simulations and hardware
Simplified policy learning by offloading constraint handling to the whole-body controller (WBC), enabling cross-platform transferability
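The parallelization result can be pictured as running one small QP per simulated environment in lockstep. The sketch below is a hedged illustration only: it uses NumPy batching on the CPU as a stand-in for GPU execution, and the function name `batched_cp_qp`, the array shapes, and the problem form (minimize 0.5 x'Px + q'x subject to l <= Ax <= u for each environment) are assumptions made here rather than details from the paper.

```python
import numpy as np

def batched_cp_qp(P, q, A, l, u, iters=300):
    """Run Chambolle-Pock on B independent QPs at once.

    Hypothetical shapes: P (B, n, n), q (B, n), A (B, m, n),
    l and u (B, m). Every A[b] is assumed nonzero.
    """
    B, m, n = A.shape
    # Per-problem step sizes with tau * sigma * ||A_b||_2^2 < 1.
    norm_A = np.linalg.norm(A, ord=2, axis=(1, 2))        # shape (B,)
    tau = sigma = (0.9 / norm_A)[:, None]                 # shape (B, 1)
    # Invert (I + tau * P_b) once per problem, outside the loop.
    M = np.linalg.inv(np.eye(n) + tau[:, :, None] * P)    # shape (B, n, n)
    x = np.zeros((B, n))
    x_bar = x.copy()
    y = np.zeros((B, m))
    for _ in range(iters):
        # Dual step for all problems: batched mat-vec, then a clip.
        v = y + sigma * np.einsum("bmn,bn->bm", A, x_bar)
        y = v - sigma * np.clip(v / sigma, l, u)
        # Primal step for all problems: batched mat-vecs only.
        w = x - tau * np.einsum("bmn,bm->bn", A, y) - tau * q
        x_new = np.einsum("bij,bj->bi", M, w)
        # Extrapolation with theta = 1.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x

# Toy usage: three "environments", each projecting a target onto the unit box.
B = 3
P = np.tile(np.eye(2), (B, 1, 1))
A = np.tile(np.eye(2), (B, 1, 1))
q = np.array([[-2.0, -2.0], [0.5, -0.5], [-0.3, 3.0]])
x_batch = batched_cp_qp(P, q, A, np.zeros((B, 2)), np.ones((B, 2)))
```

All problems share the same sequence of operations, so the loop body contains no per-problem control flow. A fixed iteration count is used here for simplicity; a practical solver would more likely stop on a residual tolerance.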

Why it matters

Provides a scalable, safety-guaranteed control paradigm for researchers and engineers developing agile, energy-efficient legged robots for real-world deployment.

Abstract

This article presents a hierarchical control framework for quadrupedal locomotion that unifies the complementary strengths of model-based optimization and reinforcement learning. We develop a convex quadratic programming (QP) solver based on the primal-dual Chambolle–Pock algorithm, enabling both massively parallel policy training and real-time deployment through efficient handling of constrained optimization problems. Our hierarchical framework employs learned policies for robust high-level control to handle real-world perturbations, while ensuring instantaneous constraint satisfaction and energy efficiency through a low-level whole-body controller powered by the proposed solver. Extensive benchmarks and experimental validation demonstrate quantifiable improvements in energy consumption, constraint satisfaction, and task transferability across simulated and real-world environments.

Index terms

Legged Robots, Optimization and Optimal Control, Deep Learning in Robotics and Automation, Reinforcement Learning