Learning Safe Locomotion for Quadrupedal Robots by Derived-Action Optimization
Deye Zhu, Chengrui Zhu, Zhen Zhang, Shuo Xin, Yong Liu
Abstract
Deep reinforcement learning controllers with exteroception have enabled quadrupedal robots to traverse terrain robustly. However, most of these controllers heavily depend on complex reward functions and suffer from poor convergence. This work proposes a novel learning framework called derived-action optimization. The derived action is defined as a high-level representation of a policy and can be introduced into the reward function to guide decision-making behaviors. The proposed derived-action optimization method is applied to learn safer quadrupedal locomotion, achieving fast convergence and better performance. Specifically, we choose the foothold as the derived action and optimize the flatness of the terrain around the foothold to reduce potential sliding and collisions. Extensive experiments demonstrate the high safety and effectiveness of our method.
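To make the idea of a foothold-based reward term concrete, the following minimal sketch shows one way a terrain-flatness penalty around a foothold could enter a reward. It is an illustration under stated assumptions, not the formulation used in this work: the abstract does not specify the flatness measure, so the sketch assumes a grid height map and uses the standard deviation of heights in a square window around the foothold cell. The names `foothold_flatness_penalty`, `shaped_reward`, `weight`, and `radius` are hypothetical.

```python
from statistics import pstdev


def foothold_flatness_penalty(height_map, row, col, radius=1):
    """Standard deviation of terrain heights in a square window around a cell.

    Assumption: flatness is measured as height spread in a local window;
    the paper's actual measure may differ.
    """
    n_rows, n_cols = len(height_map), len(height_map[0])
    # Clip the window to the bounds of the height map.
    r0, r1 = max(0, row - radius), min(n_rows, row + radius + 1)
    c0, c1 = max(0, col - radius), min(n_cols, col + radius + 1)
    window = [height_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return pstdev(window)


def shaped_reward(task_reward, height_map, foothold, weight=1.0, radius=1):
    """Subtract a weighted flatness penalty at the foothold from a task reward."""
    row, col = foothold
    penalty = foothold_flatness_penalty(height_map, row, col, radius)
    return task_reward - weight * penalty


# Hypothetical height map in metres: flat ground on the left, a step on the right.
height_map = [
    [0.0, 0.0, 0.0, 0.2, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.2],
]

flat_reward = shaped_reward(1.0, height_map, foothold=(1, 1))
edge_reward = shaped_reward(1.0, height_map, foothold=(1, 3))
```

In this sketch a foothold on flat ground keeps the full task reward, while a foothold whose window straddles the step edge is penalized, which is the kind of preference for flat contact regions that the abstract describes.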