ICRA 2026

VOCALoco: Viability-Optimized Cost-Aware Adaptive Locomotion

Stanley Wu, Mohamad Hosein Danesh, Simon Li, Hanna Yurchyk, Amin Abyaneh, Anas El Houssaini, David Paul Meger, Hsiu-Chin Lin

AI summary

VOCALoco dynamically selects the safest and most energy-efficient locomotion policy based on terrain perception, outperforming end-to-end deep reinforcement learning in stair traversal.

Legged Robots, Adaptive Locomotion, Skill Selection, Viability Prediction, Cost of Transport

Problem

Existing end-to-end deep reinforcement learning policies for legged robots lack safety guarantees, interpretability, and robust generalization to novel, unstructured terrains.

Approach

The framework uses a terrain heightmap to predict the safety and energy cost of multiple pre-trained locomotion policies, dynamically selecting the safest and most energy-efficient option for the current terrain.
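
The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each pre-trained policy already has a predicted viability (treated here as a probability of safe execution) and a predicted cost of transport, and that selection means "cheapest policy among those deemed safe". The names `PolicyPrediction`, `select_policy`, and `viability_threshold`, and the threshold value itself, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PolicyPrediction:
    """Predicted outcome of running one pre-trained policy on the observed terrain."""
    name: str
    viability: float          # assumed: predicted probability of safe execution, in [0, 1]
    cost_of_transport: float  # assumed: predicted dimensionless energy cost (lower is better)


def select_policy(
    predictions: Sequence[PolicyPrediction],
    viability_threshold: float = 0.9,  # hypothetical safety cutoff, not taken from the paper
) -> Optional[PolicyPrediction]:
    """Return the cheapest policy whose predicted viability meets the threshold.

    Returns None when no policy is predicted to be safe, leaving the fallback
    behaviour (for example, stopping) to the caller.
    """
    viable = [p for p in predictions if p.viability >= viability_threshold]
    if not viable:
        return None
    return min(viable, key=lambda p: p.cost_of_transport)


# Illustrative values only; in the framework these would come from the
# heightmap-based predictors.
example_predictions = [
    PolicyPrediction("flat_walk", viability=0.35, cost_of_transport=0.6),
    PolicyPrediction("stair_climb", viability=0.97, cost_of_transport=1.4),
    PolicyPrediction("crawl", viability=0.95, cost_of_transport=2.1),
]
chosen = select_policy(example_predictions)
```

In this sketch safety acts as a hard filter and energy cost only breaks ties among safe options, so a cheap but unsafe policy such as `flat_walk` on stairs is never chosen. Whether the paper combines the two predictions in exactly this way is not stated in this summary.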

Key results

Predicts policy viability and energy cost from terrain heightmaps
Modular design enables interpretability and easy skill extension
Fully simulation-trained with automated synthetic data generation
Outperforms a conventional end-to-end DRL policy in robustness and safety on stair ascent and descent, in simulation and on a real quadrupedal robot

Why it matters

Enables safer, more interpretable, and energy-efficient deployment of legged robots in complex, unstructured environments for applications like search-and-rescue and exploration.

Abstract

Recent advancements in legged robot locomotion have facilitated traversal over increasingly complex terrains. Despite this progress, many existing approaches rely on end-to-end deep reinforcement learning (DRL), which poses limitations in terms of safety and interpretability, especially when generalizing to novel terrains. To overcome these challenges, we introduce VOCALoco, a modular skill-selection framework that dynamically adapts locomotion strategies based on perceptual input. Given a set of pre-trained locomotion policies, VOCALoco evaluates their viability and energy consumption by predicting both the safety of execution and the anticipated cost of transport over a fixed planning horizon. This joint assessment enables the selection of policies that are both safe and energy-efficient, given the observed local terrain. We evaluate our approach on staircase locomotion tasks, demonstrating its performance in both simulated and real-world scenarios using a quadrupedal robot. Empirical results show that VOCALoco achieves improved robustness and safety during stair ascent and descent compared to a conventional end-to-end DRL policy.
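
For readers unfamiliar with the term, cost of transport is commonly defined as the energy spent divided by the product of the robot's weight and the distance travelled, which makes it dimensionless. The sketch below computes that standard quantity. The abstract does not give the exact formulation or planning horizon used in the paper, so the function name `cost_of_transport` and its parameters are assumptions for illustration.

```python
def cost_of_transport(
    energy_joules: float,
    mass_kg: float,
    distance_m: float,
    gravity: float = 9.81,  # standard gravitational acceleration in m/s^2
) -> float:
    """Dimensionless cost of transport: E / (m * g * d).

    Uses the common textbook definition; the paper's exact formulation
    may differ.
    """
    if mass_kg <= 0 or distance_m <= 0:
        raise ValueError("mass_kg and distance_m must be positive")
    return energy_joules / (mass_kg * gravity * distance_m)


# Illustrative numbers only: a 12 kg robot spending 500 J to travel 2 m.
example_cot = cost_of_transport(energy_joules=500.0, mass_kg=12.0, distance_m=2.0)
```

A lower value means less energy per unit of weight moved per unit of distance, which is why it is a convenient basis for comparing policies on the same terrain.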

Index terms

Legged Robots, Integrated Planning and Learning, Reinforcement Learning