ICRA 2026

NaviGait: Navigating Dynamically Feasible Gait Libraries Using Deep Reinforcement Learning

Neil Janwani, Varun Madabushi, Maegan Tucker

AI summary

NAVIGAIT merges offline trajectory-optimized gait libraries with residual deep reinforcement learning to achieve robust, natural-looking bipedal locomotion with faster training and simpler reward design.

Bipedal locomotion, Reinforcement learning, Gait libraries, Trajectory optimization, Residual control, Sim-to-real transfer

Problem

Trajectory optimization yields interpretable but brittle gaits, while reinforcement learning offers robustness but suffers from complex reward design, long training times, and opaque policies.

Approach

The framework uses RL to select and smoothly interpolate between precomputed reference gaits while applying minimal joint-level corrections for stabilization and task adaptation.
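
As a rough illustration of the select, interpolate, and correct structure described above, the following minimal sketch blends two library gaits and adds a bounded joint-level residual. It is not the authors' implementation: the function names, the indexing of the library by commanded speed, the linear blend, and the residual limit are all assumptions made for this example, and the policy's outputs are stood in for by plain arguments.

```python
from bisect import bisect_left

def blend_reference(speeds, gaits, speed, phase):
    """Return joint targets blended from the two library gaits bracketing `speed`.

    Hypothetical layout (an assumption, not taken from the paper):
      speeds: ascending commanded speeds, one per library gait (at least two).
      gaits:  gaits[g][t][j] is joint j of gait g at phase sample t;
              every gait has the same number of phase samples.
    """
    speed = min(max(speed, speeds[0]), speeds[-1])   # stay inside the library
    hi = min(max(bisect_left(speeds, speed), 1), len(speeds) - 1)
    lo = hi - 1
    w = (speed - speeds[lo]) / (speeds[hi] - speeds[lo])  # blend weight in [0, 1]
    n = len(gaits[lo])
    t = int(phase * n) % n                           # phase in [0, 1) -> sample index
    return [(1.0 - w) * a + w * b for a, b in zip(gaits[lo][t], gaits[hi][t])]

def apply_residual(reference, residual, limit=0.1):
    """Add a correction to the reference, clipped per joint to +/- limit (assumed)."""
    return [q + min(max(r, -limit), limit) for q, r in zip(reference, residual)]

# Toy library: two gaits, four phase samples, two joints.
speeds = [0.0, 1.0]
gaits = [[[0.0, 0.0]] * 4, [[1.0, 2.0]] * 4]
reference = blend_reference(speeds, gaits, speed=0.25, phase=0.5)
targets = apply_residual(reference, residual=[0.5, -0.02])
```

Clipping the residual keeps the final targets close to the library gait, which is one plausible way to read "minimal joint-level corrections"; the actual bound and interpolation scheme would come from the paper.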

Key results

Simplifies reward design by decoupling high-level motion from low-level correction
Speeds up training relative to conventional and imitation-based RL baselines
Maintains robust disturbance rejection and velocity tracking comparable to state-of-the-art methods
Enables easy behavioral tuning and style transfer without altering controller structure

Why it matters

Offers a scalable and interpretable control paradigm for bipedal robots that bridges model-based planning and data-driven learning.

Abstract

Reinforcement learning (RL) has emerged as a powerful method to learn robust control policies for bipedal locomotion. Yet, it can be difficult to tune desired robot behaviors due to unintuitive and complex reward design. In comparison, trajectory optimization-based methods offer more tuneable, interpretable, and mathematically grounded motion plans for high-dimensional legged systems. However, these methods often remain brittle to real-world disturbances like external perturbations. In this work, we present NAVIGAIT, a hierarchical framework that combines the structure of trajectory optimization with the adaptability of RL for robust and intuitive locomotion control. NAVIGAIT leverages RL to synthesize new motions by selecting, minimally morphing, and stabilizing gaits taken from an offline-generated gait library. NAVIGAIT results in walking policies that match the reference motion well while maintaining robustness comparable to other locomotion controllers. Additionally, the structure imposed by NAVIGAIT drastically simplifies the RL reward composition. Our experimental results demonstrate that NAVIGAIT enables faster training compared to conventional and imitation-based RL, and produces motions that remain closest to the original reference. Overall, by decoupling high-level motion generation from low-level correction, NAVIGAIT offers a more scalable and generalizable approach for achieving dynamic and robust locomotion. Videos and the full framework are publicly available at dynamicmobility.github.io/navigait.

Index terms

Humanoid and Bipedal Locomotion, Reinforcement Learning, Machine Learning for Robot Control