ICRA 2026

Contact-Safe Reinforcement Learning with ProMP Reparameterization and Energy Awareness

Bingkun Huang, Yuhe Gong, Zewen Yang, Tianyu REN, Luis Figueredo

AI summary

Key figure (auto-extracted from paper)

The PPT framework enables safe, smooth, and robust contact-rich robotic manipulation by combining RL-based trajectory adaptation with energy-tank passivity control.

Reinforcement Learning, ProMP, Energy Tank, Contact-Rich Manipulation, Passivity

Problem

Traditional reinforcement learning often produces non-smooth policies and lacks explicit safety guarantees for the discontinuous dynamics and transient forces inherent in contact-rich manipulation tasks.

Approach

The method uses PPO to optimize Probabilistic Movement Primitive (ProMP) weights for smooth task-space trajectories, which are then filtered through an energy-tank passivity layer and executed via a Cartesian impedance controller.
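As a rough illustration of how a weight vector becomes a smooth trajectory, the sketch below decodes one task-space dimension as a weighted sum of normalized Gaussian basis functions over a phase variable. The basis type, the number of basis functions, the width, and the weight values are all assumptions made for illustration; the summary does not state the paper's actual parameterization, and `rbf_features` and `promp_trajectory` are hypothetical helper names.

```python
import math


def rbf_features(phase, n_basis=5, width=0.02):
    """Normalized Gaussian basis functions evaluated at a phase in [0, 1].

    Centers are spread evenly over [0, 1]; n_basis must be at least 2.
    """
    centers = [i / (n_basis - 1) for i in range(n_basis)]
    raw = [math.exp(-((phase - c) ** 2) / (2.0 * width)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def promp_trajectory(weights, n_steps=50):
    """Decode one task-space dimension as y(z) = phi(z)^T w."""
    n_basis = len(weights)
    trajectory = []
    for k in range(n_steps):
        phase = k / (n_steps - 1)  # phase runs from 0 to 1 over the episode
        phi = rbf_features(phase, n_basis)
        trajectory.append(sum(p * w for p, w in zip(phi, weights)))
    return trajectory


# Hypothetical weights, standing in for what an episodic RL policy might output.
weights = [0.0, 0.1, 0.3, 0.45, 0.5]
trajectory = promp_trajectory(weights)
```

Because the features are normalized to sum to one, every trajectory point is a convex combination of the weights, so a policy that acts once per episode in weight space yields a smooth path rather than a sequence of independent per-step actions.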

Key results

High success rates in box pushing and maze sliding tasks
Smoother trajectories with lower jerk compared to step-wise RL policies
Reduced peak interaction power and safer force regulation via the energy-tank mechanism
Successful validation in simulation and on a Franka Panda robot across various surfaces
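The energy-tank idea behind the reduced peak interaction power can be sketched generically: the controller may only inject energy into the environment while a finite budget remains, and its command is scaled down once the budget runs out. The class below is a minimal sketch under that assumption, not the paper's formulation; the `EnergyTank` name, the bounds, and the scaling rule are hypothetical.

```python
class EnergyTank:
    """Generic energy budget that limits the power a controller can inject."""

    def __init__(self, initial_energy=2.0, min_energy=0.1, max_energy=5.0):
        self.energy = initial_energy
        self.min_energy = min_energy
        self.max_energy = max_energy

    def filter(self, force, velocity, dt):
        """Return a (possibly scaled) force that keeps the tank above min_energy."""
        # Power the controller would inject during this step: P = F . v
        power_out = sum(f * v for f, v in zip(force, velocity))
        if power_out <= 0.0:
            # The controller absorbs energy, so the tank refills up to its cap.
            self.energy = min(self.max_energy, self.energy - power_out * dt)
            return list(force)
        available = self.energy - self.min_energy
        requested = power_out * dt
        # Scale the command so the energy drawn never exceeds what is available.
        scale = 1.0 if requested <= available else max(0.0, available / requested)
        self.energy -= scale * requested
        return [scale * f for f in force]


# Hypothetical usage: a 10 N push at 1 m/s for 0.1 s requests 1 J from the tank.
tank = EnergyTank(initial_energy=1.0, min_energy=0.5, max_energy=2.0)
safe_force = tank.filter([10.0, 0.0, 0.0], [1.0, 0.0, 0.0], dt=0.1)
```

In this sketch only 0.5 J is available, so the commanded force is halved; once the tank sits at its lower bound, further energy-injecting commands are scaled to zero until the controller absorbs energy again.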

Why it matters

It provides a principled way to integrate data-driven robustness with trajectory smoothness and physical safety for robots interacting with unknown environments.

Abstract

Reinforcement learning (RL) approaches based on Markov Decision Processes (MDPs) are predominantly applied in the robot joint space, often relying on limited task-specific information and partial awareness of the 3D environment. In contrast, episodic RL has demonstrated advantages over traditional MDP-based methods in terms of trajectory consistency, task awareness, and overall performance in complex robotic tasks. Moreover, traditional step-wise and episodic RL methods often neglect the contact-rich information inherent in task-space manipulation, particularly with regard to contact safety and robustness. In this work, contact-rich manipulation tasks are tackled using a task-space, energy-safe framework, where reliable and safe task-space trajectories are generated through the combination of Proximal Policy Optimization (PPO) and movement primitives. Furthermore, an energy-aware Cartesian Impedance Controller objective is incorporated within the proposed framework to ensure safe interactions between the robot and the environment. Our experimental results demonstrate that the proposed framework outperforms existing methods in handling tasks on various types of surfaces in 3D environments, achieving high success rates as well as smooth trajectories and energy-safe interactions.

Index terms

Machine Learning for Robot Control; Robot Safety; Robust/Adaptive Control