ICRA 2026

Learning Collision-Free Object Goal Pushing for Quadruped Robots with Safe Corridors

Gabriel Lai, Yi Wong, Chung Yui Yeung, Shaohang Xu, Zhi Chen, Chin Pang Ho

AI summary

A low-dimensional safe corridor representation enables efficient, collision-free object pushing by quadrupedal robots in cluttered environments without processing raw sensor data.

quadruped robots, non-prehensile manipulation, reinforcement learning, safe corridors, collision avoidance, sim-to-real

Problem

Existing reinforcement learning methods for quadrupedal pushing largely ignore obstacle avoidance in cluttered spaces, while traditional sensor-based avoidance requires computationally expensive training pipelines.

Approach

The authors integrate a low-dimensional safe corridor into the RL observation and constraint design, allowing the policy to learn collision-free pushing efficiently without processing high-dimensional LiDAR or camera data.
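As a rough illustration of how a low-dimensional corridor could feed both the observation and the constraint, the sketch below assumes the corridor is a convex region described by a handful of half-planes. This summary does not state the authors' actual parameterization, so the box-shaped corridor, the disc approximation of the robot, and every function name here (`corridor_from_box`, `face_margins`, `constraint_cost`, `build_observation`) are hypothetical.

```python
import math

def corridor_from_box(x_min, x_max, y_min, y_max):
    """Hypothetical corridor: an axis-aligned box stored as half-planes.

    Each face is (a_x, a_y, b) and means a_x * x + a_y * y <= b.
    """
    return [
        (1.0, 0.0, x_max),
        (-1.0, 0.0, -x_min),
        (0.0, 1.0, y_max),
        (0.0, -1.0, -y_min),
    ]

def face_margins(faces, point):
    """Distance from the point to each face; negative means outside that face."""
    x, y = point
    return [(b - (ax * x + ay * y)) / math.hypot(ax, ay) for ax, ay, b in faces]

def constraint_cost(faces, point, radius):
    """Zero while a disc of the given radius fits inside the corridor.

    Otherwise returns how far the disc pokes past the worst face.
    The disc model of the robot or object is an assumption of this sketch.
    """
    worst_violation = max(radius - m for m in face_margins(faces, point))
    return max(0.0, worst_violation)

def build_observation(faces, robot_xy, object_xy, goal_xy):
    """Assumed observation: corridor margins for robot and object, plus goal offset."""
    goal_offset = [goal_xy[0] - object_xy[0], goal_xy[1] - object_xy[1]]
    return face_margins(faces, robot_xy) + face_margins(faces, object_xy) + goal_offset

if __name__ == "__main__":
    faces = corridor_from_box(0.0, 4.0, -1.0, 1.0)
    obs = build_observation(faces, (1.0, 0.0), (1.5, 0.0), (3.5, 0.0))
    print(len(obs), constraint_cost(faces, (3.9, 0.0), 0.3))
```

With four faces the observation here has only ten entries, which conveys why such a representation is far cheaper to train on than raw LiDAR scans or camera images. In a constrained RL setup, a cost like `constraint_cost` would be kept below a threshold rather than folded into the reward, though the specific constrained algorithm is not named in this summary.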

Key results

Novel constrained RL framework for collision-free object pushing in cluttered environments
Low-dimensional safe corridor representation replaces high-dimensional sensor inputs for efficient training
Policy trains in approximately one hour on a consumer GPU and transfers directly to a real Unitree Go2 robot
Demonstrates robust sim-to-real performance across varied cluttered scenarios

Why it matters

Provides a computationally efficient pathway for legged robots to safely perform non-prehensile manipulation in real-world cluttered spaces.

Abstract

While recent advancements in reinforcement learning have enabled quadrupedal robots to perform non-prehensile manipulation tasks like pushing, existing methods have largely overlooked the critical challenge of obstacle avoidance. In this paper, we address this significant limitation by introducing a novel reinforcement learning (RL) framework that controls a quadrupedal robot to push large objects in cluttered, real-world environments. In particular, obstacle avoidance is integrated as a primary objective directly into the policy training process. To achieve this, we propose to represent the traversable space with a low-dimensional safe corridor, a method that is both computationally efficient and highly effective. This approach avoids the need for complex and resource-intensive training pipelines typically required for processing high-dimensional sensor data. We validate our policy through extensive experiments in both simulation and the real world. The implementation code will be released to benefit the research community.

Index terms

Reinforcement Learning; Deep Learning in Grasping and Manipulation; Collision Avoidance