Phase-Aware Policy Learning for Skateboard Riding of Quadruped Robots Via Feature-Wise Linear Modulation
Minsung Yoon, Jeil Jeong, Sung-eui Yoon
AI summary
Problem
Controlling quadruped robots on skateboards requires managing multi-modal dynamics and seamless phase transitions under partial observability. Existing methods struggle with this because they rely on simplified assumptions and lack exteroceptive feedback.
Approach
The authors introduce Phase-Aware Policy Learning (PAPL), which uses a phase clock and FiLM-modulated neural networks to unify phase-specific behaviors in a single policy, combined with privileged learning and visual estimators to handle partial observability.
Key results
- Unified phase-conditioned policy enabling seamless pushing, carving, and transition phases
- High command-tracking accuracy and improved locomotion efficiency over legged baselines
- Successful sim-to-real transfer across diverse conditions using belly-mounted camera feedback
- Ablation studies quantifying the impact of FiLM modulation and privileged learning components
Why it matters
Enables energy-efficient, long-range quadruped locomotion by integrating passive mobility devices, advancing practical real-world deployment of legged robots.
Abstract
Skateboards offer a compact and efficient means of transportation as a type of personal mobility device. However, controlling them with legged robots poses several challenges for policy learning due to perception-driven interactions and multi-modal control objectives across distinct skateboarding phases. To address these challenges, we introduce Phase-Aware Policy Learning (PAPL), a reinforcement-learning framework tailored for skateboarding with quadruped robots. PAPL leverages the cyclic nature of skateboarding by integrating phase-conditioned Feature-wise Linear Modulation (FiLM) layers into the actor and critic networks, enabling a unified policy that captures phase-dependent behaviors while sharing robot-specific knowledge across phases. Our simulation evaluations validate command-tracking accuracy and include ablation studies quantifying each component’s contribution. We also compare locomotion efficiency against leg and wheel–leg baselines and demonstrate real-world transferability.
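To make the phase-conditioning idea concrete, the following is a minimal sketch of a FiLM layer driven by a cyclic phase signal. It is not the authors' implementation. FiLM in general scales and shifts each feature as gamma(c) * h + beta(c), where c is a conditioning input. Everything else here is an assumption for illustration: the sine/cosine phase encoding, the single linear map that generates gamma and beta, the layer sizes, and all names (`phase_encoding`, `FiLMLayer`, `make_linear`).

```python
import math
import random


def phase_encoding(phase):
    """Map a phase in [0, 1) to a point on the unit circle.

    Assumed encoding: it makes phase 0 and phase 1 coincide, which
    matches the cyclic nature of the task.
    """
    angle = 2.0 * math.pi * phase
    return [math.sin(angle), math.cos(angle)]


def make_linear(n_in, n_out, rng, scale=0.1):
    """Create small random weights (one row per output unit) and zero biases."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(weights, bias, x):
    """Apply a dense layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class FiLMLayer:
    """Feature-wise linear modulation: out = gamma(c) * h + beta(c)."""

    def __init__(self, n_features, n_cond, rng):
        self.gamma_w, self.gamma_b = make_linear(n_cond, n_features, rng)
        self.beta_w, self.beta_b = make_linear(n_cond, n_features, rng)
        # Assumed initialization: bias gamma toward 1 so the layer
        # starts close to the identity mapping.
        self.gamma_b = [1.0] * n_features

    def __call__(self, features, cond):
        gamma = linear(self.gamma_w, self.gamma_b, cond)
        beta = linear(self.beta_w, self.beta_b, cond)
        return [g * h + b for g, h, b in zip(gamma, features, beta)]


rng = random.Random(0)
film = FiLMLayer(n_features=4, n_cond=2, rng=rng)

# Hypothetical hidden activations shared across phases.
hidden = [0.5, -1.0, 2.0, 0.0]

# The same features are modulated differently at different phases.
out_a = film(hidden, phase_encoding(0.0))
out_b = film(hidden, phase_encoding(0.5))
```

In this sketch the hidden features play the role of knowledge shared across phases, while the phase-dependent gamma and beta select how those features are used at each point in the cycle. In an actor or critic network, such a layer would typically sit between ordinary dense layers.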