ICRA 2026

Reinforcement Learning for Stair Locomotion of a Wheeled Bipedal Robot with Contact-Guided Behavior Cloning

Yi Gyeom Kim, Sejik Oh, Hyojin Jo, Dogyun Park, Nam Kyu Kwon

AI summary

Dynamically increasing behavior cloning during wheel-stair contact boosts stair-climbing success rates for wheeled bipedal robots without external terrain sensors.

Reinforcement Learning; Behavior Cloning; Stair Locomotion; Wheeled Bipedal Robot; Contact-Guided Control; PPO

Problem

Stair traversal requires precise leg control during brief, sparse wheel-stair contacts, which pure reinforcement learning struggles to master without extensive reward shaping.

Approach

A contact event-guided PPO-BC framework that dynamically modulates behavior cloning weights based on wheel contact forces, guiding a student policy with a frozen leg-centered teacher policy.
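The summary does not give the exact modulation rule, so the following is only a minimal sketch of the general idea: the behavior cloning (BC) term in the loss is weighted more heavily while the wheel is in contact with a stair. The step-function schedule, the threshold and weight values, the mean-squared-error BC loss, and all names (`ContactBCSchedule`, `bc_loss`, `total_loss`) are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ContactBCSchedule:
    """Hypothetical schedule: raise the BC weight during wheel-stair contact."""
    base_weight: float = 0.1      # assumed BC weight when no contact is detected
    contact_weight: float = 1.0   # assumed BC weight during contact
    force_threshold: float = 5.0  # assumed contact-force threshold (newtons)

    def weight(self, contact_force: float) -> float:
        # Step function on the measured wheel contact force (an assumption;
        # the paper may use a smoother or event-based rule).
        if contact_force > self.force_threshold:
            return self.contact_weight
        return self.base_weight


def bc_loss(student_action: Sequence[float], teacher_action: Sequence[float]) -> float:
    """Mean squared error between student and frozen-teacher actions."""
    squared = [(s - t) ** 2 for s, t in zip(student_action, teacher_action)]
    return sum(squared) / len(squared)


def total_loss(
    ppo_loss: float,
    student_action: Sequence[float],
    teacher_action: Sequence[float],
    contact_force: float,
    schedule: ContactBCSchedule,
) -> float:
    """PPO loss plus a contact-dependent BC penalty."""
    w = schedule.weight(contact_force)
    return ppo_loss + w * bc_loss(student_action, teacher_action)
```

In this sketch the teacher actions are treated as fixed targets, reflecting the frozen teacher policy; only the student would receive gradient updates in a real training loop.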

Key results

100% success rate at 15 cm stair height
Outperforms pure PPO and uniform PPO-BC across all tested heights
Achieves stable post-traversal locomotion with minimal reward structure
Operates without external terrain sensors or stair-specific shaping rewards

Why it matters

Provides a sensor-efficient, robust control strategy for hybrid wheeled-legged robots navigating complex urban terrain.

Abstract

No abstract on file.

Index terms

Reinforcement Learning; Field Robots