SURF-Loco: Mastering Complex Industrial Terrains with 3D Surfel-Based Reinforcement Learning for Legged Robots
Bailin He, Xiting Zhao, Qiao Sun, Xiaoyi Hu, Haojie Liu, Jiangwei Zhong, Wenqiang Zhang
AI summary
Problem
Conventional 2D/2.5D perception systems fail to capture intricate 3D geometry such as overhangs and grated floors, so legged robots struggle to navigate safely in cluttered industrial environments.
Approach
The framework encodes a pre-scanned 3D surfel map into a compact latent context using a VAE, which guides a Mixture-of-Experts reinforcement learning policy to dynamically select terrain-specific control strategies.
Key results
- First surfel-based perception and control framework for legged locomotion
- VAE-compressed geometric context enables dynamic expert selection
- Robust zero-shot sim-to-real transfer on a hexapod robot
- Successful traversal of multi-level industrial obstacles like grated floors
Why it matters
Provides a scalable perception-action pipeline for autonomous legged robots operating in unstructured industrial settings where traditional platforms and 2.5D vision fail.
Abstract
Legged robots offer significant potential for navigating complex industrial terrains, but their capabilities are often constrained by perception systems struggling to interpret intricate 3D geometry. Conventional 2D/2.5D representations like depth or elevation maps fail to capture complex 3D geometry, leading to unsafe locomotion. This paper presents SURF-Loco, a novel framework that enables robust legged locomotion by learning directly from a 3D surfel-based model. Our approach uses surfels to create an omnidirectional representation that explicitly encodes the geometric properties necessary for stable locomotion. We integrate this structured 3D representation into an end-to-end Mixture-of-Experts (MoE) reinforcement learning policy. A variational autoencoder (VAE) distills the complex 3D surroundings into a compact latent context. This geometric context enables a gating network to dynamically select expert sub-policies for agile, context-aware actions. We validate our method on the Lenovo Daystar IS hexapod robot, achieving robust zero-shot sim-to-real transfer on a variety of challenging industrial obstacles.
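
The latent-conditioned gating described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the single linear layer per network, the softmax (soft) blending of experts, and every name below (`sample_latent`, `moe_action`, `random_layer`) are assumptions made for illustration. The abstract does not say whether expert selection is soft or hard, nor how the networks are built.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(layer, x):
    """Affine map; layer["W"] holds one weight row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(layer["W"], layer["b"])]


def random_layer(n_out, n_in, rng):
    """Hypothetical stand-in for trained weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return {"W": weights, "b": [0.0] * n_out}


def sample_latent(mu, log_var, rng):
    """VAE reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def moe_action(z, proprio, gate, experts):
    """Blend expert outputs with gate weights computed from the latent z."""
    gate_w = softmax(linear(gate, z))             # one weight per expert
    obs = list(proprio) + list(z)                 # experts see state + context
    outputs = [linear(expert, obs) for expert in experts]
    action_dim = len(outputs[0])
    action = [sum(g * out[i] for g, out in zip(gate_w, outputs))
              for i in range(action_dim)]
    return action, gate_w


# Assumed sizes: 8-D latent, 6-D proprioception, 4 experts, 18 joint targets.
LATENT_DIM, PROPRIO_DIM, NUM_EXPERTS, ACTION_DIM = 8, 6, 4, 18

rng = random.Random(0)
gate = random_layer(NUM_EXPERTS, LATENT_DIM, rng)
experts = [random_layer(ACTION_DIM, PROPRIO_DIM + LATENT_DIM, rng)
           for _ in range(NUM_EXPERTS)]

# In a full pipeline mu and log_var would come from the VAE encoder applied
# to the local surfel map; here they are placeholder values.
z = sample_latent([0.0] * LATENT_DIM, [-2.0] * LATENT_DIM, rng)
action, gate_w = moe_action(z, [0.0] * PROPRIO_DIM, gate, experts)
```

Soft blending keeps the action continuous as the geometric context changes. A hard, top-1 selection would instead execute only the expert with the largest gate weight.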