ICRA 2026

URPlanner: A Universal Paradigm for Collision-Free Robotic Motion Planning Based on Deep Reinforcement Learning

Fengkang Ying, Hanwen Zhang, Haozhe Wang, Huishi Huang, Marcelo H Ang Jr

AI summary

URPlanner enables cost-effective, collision-free motion planning for arbitrary redundant manipulators without solving inverse kinematics (IK), using a universal obstacle avoidance reward and augmented DRL training.

Collision-free motion planning; Deep reinforcement learning; Universal obstacle avoidance reward; Augmented policy exploration; Expert data diffusion; Redundant manipulators

Problem

Existing DRL-based motion planners for redundant manipulators are computationally costly, overly dependent on minimum distance calculations, and suffer from poor exploration and inefficient data utilization.

Approach

The method parameterizes the environment using bounding boxes and line segments to create a distance-independent obstacle avoidance reward, combined with an enhanced exploration algorithm and a data diffusion strategy to train policies from minimal expert demonstrations.
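To make the idea of a distance-independent reward concrete, here is a minimal sketch, assuming that obstacles are axis-aligned bounding boxes and that each manipulator link is a single line segment. The names `segment_hits_box` and `avoidance_reward`, the binary penalty value, and the slab-based intersection test are illustrative assumptions, not the reward defined in the paper.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Segment = Tuple[Vec3, Vec3]   # (start point, end point) of one link
Box = Tuple[Vec3, Vec3]       # (min corner, max corner) of one obstacle


def segment_hits_box(p0: Vec3, p1: Vec3, box_min: Vec3, box_max: Vec3,
                     eps: float = 1e-12) -> bool:
    """Slab test: does the segment p0->p1 intersect the axis-aligned box?"""
    t_enter, t_exit = 0.0, 1.0  # parameter range of the segment still inside all slabs
    for a, b, lo, hi in zip(p0, p1, box_min, box_max):
        d = b - a
        if abs(d) < eps:
            # Segment is parallel to this slab: it must already lie inside it.
            if a < lo or a > hi:
                return False
            continue
        t0, t1 = (lo - a) / d, (hi - a) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter = max(t_enter, t0)
        t_exit = min(t_exit, t1)
        if t_enter > t_exit:
            return False
    return True


def avoidance_reward(links: Sequence[Segment], boxes: Sequence[Box],
                     penalty: float = -1.0) -> float:
    """Hypothetical reward term: a fixed penalty if any link crosses any box.

    No minimum distance is computed; only intersection tests are used.
    """
    for p0, p1 in links:
        for box_min, box_max in boxes:
            if segment_hits_box(p0, p1, box_min, box_max):
                return penalty
    return 0.0


if __name__ == "__main__":
    box = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
    crossing = ((-1.0, 0.5, 0.5), (2.0, 0.5, 0.5))
    clear = ((-1.0, 2.0, 0.5), (2.0, 2.0, 0.5))
    print(avoidance_reward([crossing], [box]))
    print(avoidance_reward([clear], [box]))
```

The sketch only illustrates why such a parameterization can be cheap: each reward evaluation reduces to a handful of comparisons per link and obstacle, with no closest-point query. The actual reward shaping used by URPlanner may differ.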

Key results

Universal obstacle avoidance reward independent of minimum distance
Augmented policy exploration and evaluation algorithm for stable DRL training
Expert data diffusion strategy generating large datasets from few demonstrations
Platform-agnostic planning that bypasses inverse kinematics for arbitrary manipulators

Why it matters

Provides a platform-agnostic framework for training collision-free motion policies on arbitrary redundant manipulators that is cost-effective in both training and deployment and does not require solving inverse kinematics.

Abstract

Collision-free motion planning for redundant robot manipulators in complex environments is yet to be explored. Although recent advancements at the intersection of deep reinforcement learning (DRL) and robotics have highlighted its potential to handle versatile robotic tasks, current DRL-based collision-free motion planners for manipulators are highly costly, hindering their deployment and application. This is due to an overreliance on the minimum distance between the manipulator and obstacles, inadequate exploration and decision-making by DRL, and inefficient data acquisition and utilization. In this article, we propose URPlanner, a universal paradigm for collision-free robotic motion planning based on DRL. URPlanner offers several advantages over existing approaches: it is platform-agnostic, cost-effective in both training and deployment, and applicable to arbitrary manipulators without solving inverse kinematics. To achieve this, we first develop a parameterized task space and a universal obstacle avoidance reward that is independent of minimum distance. Second, we introduce an augmented policy exploration and evaluation algorithm that can be applied to various DRL algorithms to enhance their performance. Third, we propose an expert data diffusion strategy for efficient policy learning, which can produce a large-scale trajectory dataset from only a few expert demonstrations. Finally, the superiority of the proposed methods is comprehensively verified through experiments.

Index terms

Motion and Path Planning; Collision Avoidance; AI-Based Methods; Industrial Robots