ICRA 2026

DemoDiffusion: One-Shot Human Imitation Using Pre-Trained Diffusion Policy

Sungjae Park, Homanga Bharadhwaj, Shubham Tulsiani

AI summary

DemoDiffusion enables robots to execute novel tasks from a single human demonstration by refining kinematically retargeted trajectories with a pre-trained diffusion policy, achieving an 83.8% real-world success rate without task-specific training.

One-shot imitation; diffusion policy; kinematic retargeting; robot manipulation; human demonstration; generalist policies

Problem

Generalist robot policies struggle with zero-shot deployment in novel environments, while existing one-shot imitation methods rely on brittle kinematic retargeting or require costly online reinforcement learning and paired human-robot data.

Approach

The method extracts 3D hand poses from a human video, converts them to an open-loop robot trajectory via kinematic retargeting, and then uses a pre-trained diffusion policy to iteratively denoise and refine this trajectory into feasible, closed-loop robot actions.
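
As an illustration of the refinement step, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes an SDEdit-style procedure: the retargeted trajectory is noised to an intermediate diffusion step and then denoised with deterministic DDIM updates. The names `make_alpha_bars`, `make_toy_noise_predictor`, and `refine_trajectory` are hypothetical, and the toy noise predictor (the exact denoiser for a Gaussian centred on a fixed "prior" trajectory) merely stands in for a pre-trained, observation-conditioned diffusion policy.

```python
import numpy as np


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta) for a linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def make_toy_noise_predictor(prior_traj, sigma, alpha_bars):
    """Hypothetical stand-in for a pre-trained diffusion policy.

    Returns the exact noise predictor for data drawn from
    N(prior_traj, sigma^2 I); a real policy would be a neural network
    conditioned on robot observations.
    """
    def predict_noise(x, t):
        a = alpha_bars[t]
        var = a * sigma**2 + (1.0 - a)  # variance of the noised data at step t
        return np.sqrt(1.0 - a) * (x - np.sqrt(a) * prior_traj) / var
    return predict_noise


def refine_trajectory(retargeted, predict_noise, alpha_bars, start_step, rng):
    """Noise the retargeted trajectory to `start_step`, then denoise to step 0."""
    a = alpha_bars[start_step]
    noise = rng.standard_normal(retargeted.shape)
    x = np.sqrt(a) * retargeted + np.sqrt(1.0 - a) * noise
    for t in range(start_step, -1, -1):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps = predict_noise(x, t)
        # Deterministic DDIM update: estimate the clean sample, then re-noise.
        x0 = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bars = make_alpha_bars(200, beta_end=0.05)
    # Toy 2-D end-effector path favoured by the "policy" (an assumption).
    prior = np.linspace(0.0, 1.0, 20)[:, None].repeat(2, axis=1)
    retargeted = prior + 0.3  # retargeted human motion, offset from the prior
    predict = make_toy_noise_predictor(prior, 0.1, alpha_bars)
    refined = refine_trajectory(retargeted, predict, alpha_bars, 60, rng)
    print(np.abs(refined - prior).mean(), np.abs(refined - retargeted).mean())
```

In this toy setting `start_step` is a free parameter: a larger value lets the policy prior dominate, while a smaller value keeps the result closer to the human-derived trajectory. The sketch refines a single action chunk once; it does not model closed-loop execution.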

Key results

83.8% average success rate across 8 real-world manipulation tasks
Surpasses base diffusion policy (13.8%) and kinematic retargeting (52.5%) in real-world tests
Successfully executes tasks where the pre-trained generalist policy fails entirely
Robust performance on simulated dexterous grasping across varying object sizes

Why it matters

It provides a practical, low-effort deployment pathway for generalist robot policies in unstructured environments, making one-shot human imitation accessible to non-expert users without requiring task-specific data collection or online training.

Abstract

We propose DemoDiffusion, a simple method for enabling robots to perform manipulation tasks by imitating a single human demonstration, without requiring task-specific training or paired human-robot data. Our approach is based on two insights. First, the hand motion in a human demonstration provides a useful prior for the robot’s end-effector trajectory, which we can convert into a rough open-loop robot motion trajectory via kinematic retargeting. Second, while this retargeted motion captures the overall structure of the task, it may not align well with plausible robot actions in-context. To address this, we leverage a pre-trained generalist diffusion policy to modify the trajectory, ensuring it both follows the human motion and remains within the distribution of plausible robot actions. Unlike approaches based on online reinforcement learning or paired human-robot data, our method enables robust adaptation to new tasks and scenes with minimal effort. In real-world experiments across 8 diverse manipulation tasks, DemoDiffusion achieves 83.8% average success rate, compared to 13.8% for the pre-trained policy and 52.5% for kinematic retargeting, succeeding even on tasks where the pre-trained generalist policy fails entirely. Project page: https://demodiffusion.github.io/
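
To make the first insight concrete, the following sketch shows one simple way a hand trajectory could be turned into an open-loop gripper trajectory. It is an illustrative assumption, not the paper's retargeting procedure: it assumes a parallel-jaw gripper, takes the end-effector position as the midpoint between thumb and index fingertips, and maps fingertip distance to a normalized opening. The function name `retarget_hand_to_gripper` and the `max_opening` parameter are hypothetical, and orientation is omitted for brevity.

```python
import numpy as np


def retarget_hand_to_gripper(thumb_tips, index_tips, max_opening=0.08):
    """Map per-frame fingertip positions to (x, y, z, opening) commands.

    thumb_tips, index_tips: arrays of shape (T, 3), in metres.
    max_opening: assumed maximum gripper aperture in metres (hypothetical).
    Returns an array of shape (T, 4); opening is normalized to [0, 1].
    """
    thumb_tips = np.asarray(thumb_tips, dtype=float)
    index_tips = np.asarray(index_tips, dtype=float)
    # End-effector position: midpoint between the two fingertips.
    positions = 0.5 * (thumb_tips + index_tips)
    # Gripper opening: fingertip distance scaled by the maximum aperture.
    distances = np.linalg.norm(index_tips - thumb_tips, axis=1)
    opening = np.clip(distances / max_opening, 0.0, 1.0)
    return np.concatenate([positions, opening[:, None]], axis=1)


if __name__ == "__main__":
    thumb = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    index = [[0.08, 0.0, 0.0], [0.02, 0.0, 0.0]]
    print(retarget_hand_to_gripper(thumb, index))
```

A trajectory produced this way ignores the robot's observations and dynamics, which is the gap the abstract's second insight addresses.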

Index terms

Deep Learning in Grasping and Manipulation; Dexterous Manipulation; Learning from Demonstration