ICRA 2026

TOLEBI: Learning Fault-Tolerant Bipedal Locomotion Via Online Status Estimation and Fallibility Rewards

Hokyun Lee, Woo-Jeong Baek, Junhyeok Cha, Jaeheung Park

AI summary

TOLEBI enables humanoid robots to maintain stable locomotion despite sudden joint failures by learning adaptive control policies and estimating joint status in real time.

fault-tolerant locomotion, reinforcement learning, bipedal robots, online state estimation, sim-to-real transfer, humanoid robotics

Problem

Current reinforcement learning controllers for bipedal robots struggle to handle unexpected hardware faults like joint locking or power loss, often leading to catastrophic falls in real-world settings.

Approach

The framework uses curriculum learning with simulated motor failures and an online GRU-based joint status estimator to train a reinforcement learning policy that dynamically adjusts gait timing and torque commands to recover from faults.
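As a rough illustration of fault injection under a curriculum, the sketch below samples a joint fault with a probability that grows with training progress and then masks the commanded torques. The linear schedule, the `max_prob` value, the one-fault-per-episode rule, and all function names are assumptions made for this example; they are not taken from the paper.

```python
import random
from enum import Enum


class JointStatus(Enum):
    NORMAL = 0
    LOCKED = 1      # joint held at a fixed angle
    POWER_LOSS = 2  # motor produces no torque


def fault_probability(progress, max_prob=0.5):
    """Linear curriculum (assumed): fault rate rises with progress in [0, 1]."""
    progress = min(max(progress, 0.0), 1.0)
    return max_prob * progress


def sample_fault(num_joints, progress, rng):
    """Return one status per joint; at most one joint is faulted (assumed)."""
    status = [JointStatus.NORMAL] * num_joints
    if rng.random() < fault_probability(progress):
        joint = rng.randrange(num_joints)
        status[joint] = rng.choice([JointStatus.LOCKED, JointStatus.POWER_LOSS])
    return status


def apply_fault(torques, status):
    """Zero the torque on power-loss joints and list the locked joints.

    Holding a locked joint in place is simulator-specific, so this sketch
    only reports which indices the simulator should freeze.
    """
    out = list(torques)
    locked = []
    for i, s in enumerate(status):
        if s is JointStatus.POWER_LOSS:
            out[i] = 0.0
        elif s is JointStatus.LOCKED:
            locked.append(i)
    return out, locked
```

In a training loop, `sample_fault` would be called at episode reset and `apply_fault` before each simulator step; early in training (`progress` near 0) the policy sees almost no faults.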

Key results

First learning-based fault-tolerant framework for bipedal locomotion
Online joint status estimator trained concurrently with the control policy
Novel fallibility reward design that preserves natural walking style under motor failures
Successful sim-to-real deployment on the TOCABI humanoid robot for flat walking and stair descent

Why it matters

Enables safer and more reliable deployment of learning-based humanoid robots in real-world environments where hardware failures are unavoidable.

Abstract

With the growing employment of learning algorithms in robotic applications, research on reinforcement learning for bipedal locomotion has become a central topic for humanoid robotics. While recently published contributions achieve high success rates in locomotion tasks, scarce attention has been devoted to the development of methods that can handle hardware faults that may occur during the locomotion process. However, in real-world settings, environmental disturbances or sudden occurrences of hardware faults might yield severe consequences. To address these issues, this paper presents TOLEBI: a fault-tolerant learning framework for bipedal locomotion that handles faults on the robot during operation. Specifically, joint locking, power loss and external disturbances are injected in simulation to learn fault-tolerant locomotion strategies. In addition to transferring the learned policy to the real robot via sim-to-real transfer, an online joint status estimator is incorporated. This module classifies joint conditions by referring to the actual observations at runtime under real-world conditions. The validation experiments conducted both in the real world and in simulation with the humanoid robot TOCABI highlight the applicability of the proposed approach. To our knowledge, this work provides the first learning-based fault-tolerant framework for bipedal locomotion, thereby fostering the development of efficient learning methods in this field.
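To make the idea of an online joint status estimator more concrete, here is a minimal sketch of a recurrent classifier that consumes one observation per control step and outputs class probabilities for a single joint. The weights are random and untrained, biases are omitted, and the class set (normal, locked, power loss), the dimensions, and the class name `GRUStatusEstimator` are all hypothetical; the paper's actual architecture and inputs may differ.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUStatusEstimator:
    """Untrained GRU cell plus a linear softmax head (illustrative only)."""

    def __init__(self, obs_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        in_dim = obs_dim + hidden_dim
        self.hidden_dim = hidden_dim
        self.w_z = rand_matrix(hidden_dim, in_dim, rng)  # update gate
        self.w_r = rand_matrix(hidden_dim, in_dim, rng)  # reset gate
        self.w_h = rand_matrix(hidden_dim, in_dim, rng)  # candidate state
        self.w_out = rand_matrix(num_classes, hidden_dim, rng)
        self.h = [0.0] * hidden_dim

    def reset(self):
        """Clear the hidden state, e.g. at the start of an episode."""
        self.h = [0.0] * self.hidden_dim

    def step(self, obs):
        """Update the hidden state with one observation; return class probabilities."""
        x = list(obs) + self.h
        z = [sigmoid(a) for a in matvec(self.w_z, x)]
        r = [sigmoid(a) for a in matvec(self.w_r, x)]
        x_reset = list(obs) + [ri * hi for ri, hi in zip(r, self.h)]
        cand = [math.tanh(a) for a in matvec(self.w_h, x_reset)]
        self.h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, self.h, cand)]
        logits = matvec(self.w_out, self.h)
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]
```

Because the hidden state persists between calls to `step`, the estimate at each control step can depend on the recent observation history rather than on a single frame, which is the property that makes a recurrent model a natural fit for runtime fault classification.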

Index terms

Humanoid and Bipedal Locomotion, Legged Robots, Reinforcement Learning