ICRA 2026

Diffusion Policy for Robot-Assisted Dressing with Moving Human Arms

Haoxiang Sun, David Navarro-Alarcon

AI summary

A diffusion-based visuomotor policy combined with real-time point cloud registration enables robots to successfully dress users with dynamically moving arms.
robot-assisted dressing, diffusion policy, point cloud registration, dynamic human motion, visuomotor control, human-robot interaction

Problem

Robot-assisted dressing struggles with deformable garments, visual occlusions, and the restrictive assumption that users must remain static, limiting natural interaction. Existing methods lack robust, real-time adaptation to dynamic human arm movements.

Approach

The system uses a hierarchical vision-based framework where a diffusion model learns action distributions from point clouds, while a local axial scalar field and point cloud registration continuously adapt the robot's trajectory to the user's moving arm in real time.
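To make the "diffusion model learns action distributions" idea concrete, here is a minimal sketch of standard DDPM-style reverse sampling over an action sequence. It is not the paper's implementation: the linear noise schedule, the step count, and the names `make_schedule`, `sample_actions`, `eps_model`, and `obs_feature` are all assumptions for illustration, and `eps_model` stands in for a trained noise-prediction network conditioned on point cloud features.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def sample_actions(eps_model, obs_feature, horizon, action_dim,
                   num_steps=50, seed=0):
    """Draw one action sequence by iteratively denoising Gaussian noise.

    eps_model(x, k, obs_feature) is a hypothetical callable that predicts
    the noise contained in x at diffusion step k, given the observation.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Start from pure noise with shape (horizon, action_dim).
    x = rng.normal(size=(horizon, action_dim))
    for k in reversed(range(num_steps)):
        eps_hat = eps_model(x, k, obs_feature)
        # Posterior mean of the DDPM reverse step.
        mean = (x - betas[k] / np.sqrt(1.0 - alpha_bars[k]) * eps_hat) / np.sqrt(alphas[k])
        # No noise is added on the final step.
        noise = rng.normal(size=x.shape) if k > 0 else 0.0
        x = mean + np.sqrt(betas[k]) * noise
    return x
```

In a real system the returned array would be interpreted as a short horizon of end-effector actions, and sampling would be repeated as new point clouds arrive.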

Key results

  • High success rates across 3 garment types, 4 body types, and 10 dynamic motion scenarios
  • Outperforms diffusion, imitation learning, and MPC baselines in sleeve insertion and dressing ratio metrics
  • Enables real-time trajectory adaptation to non-static arm movements without full reconstruction
  • Generalizes to unseen human poses and garment configurations beyond expert demonstrations

Why it matters

Makes robot-assisted dressing more practical and comfortable for daily living by accommodating natural human movement during interaction.

Abstract

Robot-assisted dressing remains challenging due to the close physical human–robot interaction and the highly deformable nature of garments. This work presents a purely vision-based approach that transfers human-mastered dressing skills to robots while accommodating dynamic human arm movements. The proposed method adopts a hierarchical structure. At the high level, a diffusion model serves as the policy to learn action distributions conditioned on point cloud observations. During execution, a diffused scalar field is constructed to infer an object-centric axial distribution of the human arm from cluttered points. Local point cloud registration across consecutive frames further captures arm motion, enabling real-time adaptation of robot actions to user dynamics. Comprehensive evaluations have been conducted in both simulation and real-world dressing scenarios using a UR10e robot with human participants of diverse genders and body types.
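As a rough illustration of how registration across consecutive frames can drive trajectory adaptation, the sketch below estimates a rigid transform between two arm point sets with the Kabsch algorithm and applies it to planned waypoints. This is a swapped-in textbook technique, not the paper's registration method: it assumes the arm segment moves rigidly between frames and that point correspondences are already known (row i matches row i), and the names `estimate_rigid_transform` and `adapt_waypoints` are hypothetical.

```python
import numpy as np


def estimate_rigid_transform(prev_pts, curr_pts):
    """Least-squares rigid transform (R, t) such that curr ≈ R @ prev + t.

    Kabsch algorithm; assumes row i of prev_pts corresponds to row i of
    curr_pts, which a real pipeline would have to establish first.
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    curr_pts = np.asarray(curr_pts, dtype=float)
    prev_centroid = prev_pts.mean(axis=0)
    curr_centroid = curr_pts.mean(axis=0)

    # Cross-covariance of the centered point sets.
    H = (prev_pts - prev_centroid).T @ (curr_pts - curr_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = curr_centroid - R @ prev_centroid
    return R, t


def adapt_waypoints(waypoints, R, t):
    """Move planned waypoints (N x 3) along with the estimated arm motion."""
    return np.asarray(waypoints, dtype=float) @ R.T + t
```

Calling `estimate_rigid_transform` on each new frame pair and passing the result to `adapt_waypoints` keeps a planned path expressed relative to the arm rather than to the fixed camera frame.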

Index terms

Human-Centered Robotics, Imitation Learning, Physical Human-Robot Interaction
