VOCALoco: Viability-Optimized Cost-Aware Adaptive Locomotion
Stanley Wu, Mohamad Hosein Danesh, Simon Li, Hanna Yurchyk, Amin Abyaneh, Anas El Houssaini, David Paul Meger, Hsiu-Chin Lin
AI summary
Problem
Existing end-to-end deep reinforcement learning policies for legged robots lack safety guarantees, interpretability, and robust generalization to novel, unstructured terrains.
Approach
The framework uses a terrain heightmap to predict the safety and energy cost of multiple pre-trained locomotion policies, dynamically selecting the safest and most energy-efficient option for the current terrain.
Key results
- Predicts policy viability and energy cost from terrain heightmaps
- Modular design enables interpretability and easy skill extension
- Fully simulation-trained with automated synthetic data generation
- Outperforms a conventional end-to-end DRL policy in robustness and safety during stair ascent and descent, in both simulation and real-world tests
Why it matters
Enables safer, more interpretable, and energy-efficient deployment of legged robots in complex, unstructured environments for applications like search-and-rescue and exploration.
Abstract
Recent advancements in legged robot locomotion have facilitated traversal over increasingly complex terrains. Despite this progress, many existing approaches rely on end-to-end deep reinforcement learning (DRL), which poses limitations in terms of safety and interpretability, especially when generalizing to novel terrains. To overcome these challenges, we introduce VOCALoco, a modular skill-selection framework that dynamically adapts locomotion strategies based on perceptual input. Given a set of pre-trained locomotion policies, VOCALoco evaluates their viability and energy consumption by predicting both the safety of execution and the anticipated cost of transport over a fixed planning horizon. This joint assessment enables the selection of policies that are both safe and energy-efficient, given the observed local terrain. We evaluate our approach on staircase locomotion tasks, demonstrating its performance in both simulated and real-world scenarios using a quadrupedal robot. Empirical results show that VOCALoco achieves improved robustness and safety during stair ascent and descent compared to a conventional end-to-end DRL policy.
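The joint assessment described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the viability and cost-of-transport predictions are combined, so the rule below is an assumption made for illustration only: keep the policies whose predicted viability clears a threshold, pick the one with the lowest predicted cost of transport, and fall back to the most viable policy if none clears the threshold. The names `PolicyEstimate` and `select_policy`, the threshold value, the policy names, and the numbers are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PolicyEstimate:
    """Hypothetical container for the per-policy predictions on the current terrain."""
    name: str
    viability: float          # assumed: predicted probability of safe execution, in [0, 1]
    cost_of_transport: float  # assumed: predicted cost of transport over the planning horizon


def select_policy(
    estimates: Sequence[PolicyEstimate],
    viability_threshold: float = 0.9,  # assumed value, not from the paper
) -> PolicyEstimate:
    """Pick the cheapest policy among those predicted to be safe enough.

    Assumed fallback: if no policy clears the threshold, return the one
    with the highest predicted viability.
    """
    if not estimates:
        raise ValueError("at least one candidate policy is required")
    viable = [e for e in estimates if e.viability >= viability_threshold]
    if viable:
        return min(viable, key=lambda e: e.cost_of_transport)
    return max(estimates, key=lambda e: e.viability)


# Illustrative numbers only; in practice these would come from learned predictors
# that take the local terrain heightmap as input.
candidates = [
    PolicyEstimate("walk", viability=0.35, cost_of_transport=0.8),
    PolicyEstimate("stair_climb", viability=0.97, cost_of_transport=1.6),
    PolicyEstimate("stair_crawl", viability=0.99, cost_of_transport=2.4),
]
chosen = select_policy(candidates)
print(chosen.name)  # stair_climb
```

In this sketch safety acts as a hard filter and energy as a tie-breaker among safe options, which is one of several plausible ways to combine the two predictions; a weighted score would be another.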