ICRA 2026

Deep Reinforcement Learning for Hip Exoskeleton Control Via Predictive Simulation of Reflex-Based Human Gait

Hossein Barati, Sangdo Kim, Thanh Xuan Nguyen, Jongwon Lee, Young Jin Park

AI summary

A deep RL controller trained in a predictive simulation reduces walking metabolic cost by 9.1% on a real hip exoskeleton without requiring experimental human data.

Deep reinforcement learning, hip exoskeleton, predictive simulation, metabolic cost reduction, sim-to-real transfer, LSTM-PPO

Problem

Conventional exoskeleton controllers lack adaptability and require labor-intensive tuning, while existing RL approaches often depend on experimental reference data, limiting generalizability and complicating sim-to-real transfer.

Approach

The team trains a PPO controller with an LSTM network using a predictive simulation of a reflex-based musculoskeletal model, applying domain randomization to bridge the sim-to-real gap before deploying the policy on physical hardware.
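For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate objective, written in plain Python. It is not the authors' training code: the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration, and the summary does not state the hyperparameters used in the paper.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    All argument names and the clip_eps default are illustrative assumptions,
    not values taken from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps] to limit the size of each update.
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. The clipping only takes effect once the updated policy moves away from the policy that collected the data.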

Key results

Reduced metabolic cost of walking by an average of 9.1% in human trials
Outperformed conventional DOFC control (5.8% reduction) under identical conditions
Enabled experiment-free controller training using a predictive reflex-based gait model
Achieved robust sim-to-real transfer via domain randomization and direct kinematic-to-torque mapping
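Domain randomization, mentioned in the results above, generally means resampling simulation parameters at the start of each training episode so the policy does not overfit to one simulated setup. The sketch below shows the general pattern only. The randomized quantities (`body_mass_scale`, `torque_delay_steps`, `sensor_noise_std`) and their ranges are hypothetical, since the summary does not say which parameters the authors randomized.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    # Hypothetical randomized quantities; the paper's actual choices are not stated here.
    body_mass_scale: float    # multiplier on the model's nominal body mass
    torque_delay_steps: int   # actuation delay, in control steps
    sensor_noise_std: float   # std. dev. of noise added to kinematic observations


def sample_episode_params(rng):
    """Draw one set of simulation parameters for a training episode."""
    return EpisodeParams(
        body_mass_scale=rng.uniform(0.8, 1.2),     # assumed range
        torque_delay_steps=rng.randint(0, 5),      # assumed range, inclusive
        sensor_noise_std=rng.uniform(0.0, 0.02),   # assumed range
    )


def noisy_observation(clean_obs, params, rng):
    """Add zero-mean Gaussian noise to each element of a kinematic observation."""
    return [x + rng.gauss(0.0, params.sensor_noise_std) for x in clean_obs]
```

A training loop would call `sample_episode_params` once per episode, configure the simulator with the result, and pass each observation through `noisy_observation` before it reaches the policy.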

Why it matters

Provides a scalable framework for developing adaptive exoskeleton controllers without experimental human data, with the aim of improving walking efficiency and accelerating real-world deployment.

Abstract

Lower-limb exoskeletons have the potential to enhance mobility and reduce the metabolic cost of walking, but conventional control strategies often lack adaptability and require labor-intensive tuning. Recent advances in reinforcement learning (RL) provide new opportunities for generating efficient and personalized assistance. In this study, we propose a predictive simulation framework that integrates a reflex-based musculoskeletal walking model with a hip exoskeleton controller trained using Proximal Policy Optimization (PPO) with a Long Short-Term Memory (LSTM) actor network. The reflex-based model reproduces realistic gait kinematics without relying on experimental motion data, while the LSTM-PPO controller learns to map kinematic states directly to assistive torques. Domain randomization was applied during training to enhance robustness and facilitate sim-to-real transfer. The learned controller was deployed onto a physical hip exoskeleton and evaluated in human subject experiments. Results showed that the LSTM-PPO controller reduced the metabolic cost of walking by an average of 9.1%. These findings highlight the potential of predictive simulation and deep RL for developing intelligent, experiment-free exoskeleton controllers that improve walking efficiency and robustness in real-world conditions.

Index terms

Prosthetics and Exoskeletons; Wearable Robotics; Reinforcement Learning