ICRA 2026

Uncertainty Comes for Free: Human-In-The-Loop Policies with Diffusion Models

Zhanpeng He, Yifeng Cao, Matei Ciocarlie


AI summary


By using the diffusion policy's denoising process as an uncertainty metric that needs no extra training, a robot can request human assistance only when necessary, reducing operator burden while improving task success.

Diffusion policies, Human-in-the-loop, Uncertainty estimation, Robot teleoperation, Policy fine-tuning, Autonomous deployment

Problem

Continuous human monitoring in human-in-the-loop robot deployment is labor-intensive and impractical for scaling, yet existing methods lack efficient, training-free uncertainty estimation to strategically trigger assistance at deployment time.

Approach

The method computes an uncertainty metric from the diffusion policy's denoising vector field to decide when to request teleoperation, and uses the resulting intervention data to efficiently fine-tune the policy without requiring expert interaction during training.
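To make the idea of reading uncertainty off a denoising vector field concrete, here is a minimal sketch. It is not the paper's metric, which this summary does not specify. It assumes a hypothetical callable `denoise_field(observation, action)` that returns the denoising direction for a candidate action, and it scores uncertainty as the directional dispersion of those vectors (one minus the length of the mean unit vector). The toy fields, the function names, and the 0.5 threshold are illustrative assumptions only.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def directional_dispersion(vectors: Sequence[Vector]) -> float:
    """Return 1 - ||mean unit vector||: near 0 if vectors agree, near 1 if they cancel."""
    dim = len(vectors[0])
    mean = [0.0] * dim
    count = 0
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:
            continue  # skip zero vectors, which carry no direction
        for i in range(dim):
            mean[i] += v[i] / norm
        count += 1
    if count == 0:
        return 0.0
    mean = [m / count for m in mean]
    return 1.0 - math.sqrt(sum(m * m for m in mean))


def denoising_uncertainty(
    denoise_field: Callable[[Vector, Vector], Vector],
    observation: Vector,
    sampled_actions: Sequence[Vector],
) -> float:
    """Evaluate the (hypothetical) denoising field at sampled actions and score disagreement."""
    vectors = [denoise_field(observation, a) for a in sampled_actions]
    return directional_dispersion(vectors)


def should_request_help(uncertainty: float, threshold: float = 0.5) -> bool:
    """Assumed decision rule: ask for teleoperation above a fixed threshold."""
    return uncertainty > threshold


# Toy stand-ins for a trained policy (assumptions, not learned models).
def unimodal_field(observation: Vector, action: Vector) -> List[float]:
    """Every action is pulled toward a single target action."""
    target = (1.0, 0.0)
    return [t - a for t, a in zip(target, action)]


def bimodal_field(observation: Vector, action: Vector) -> List[float]:
    """Each action is pulled toward the nearer of two competing target actions."""
    targets = [(1.0, 0.0), (-1.0, 0.0)]
    nearest = min(targets, key=lambda t: sum((ti - ai) ** 2 for ti, ai in zip(t, action)))
    return [t - a for t, a in zip(nearest, action)]


observation = (0.0,)  # placeholder observation, unused by the toy fields
sampled_actions = [(0.1, 0.0), (-0.1, 0.0), (0.2, 0.05), (-0.2, -0.05)]

u_confident = denoising_uncertainty(unimodal_field, observation, sampled_actions)
u_ambiguous = denoising_uncertainty(bimodal_field, observation, sampled_actions)
```

With these toy fields, the single-target case yields a low score and the two-target case a high one, so only the latter would trigger `should_request_help`. In a real deployment the threshold would need tuning per task, and the steps flagged this way are the ones whose teleoperation data would be kept for fine-tuning.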

Key results

Achieves 100% task success with fewer human intervention steps than baselines
Requires no human interaction during training and minimal deployment overhead
Enables efficient policy fine-tuning using targeted teleoperation data
Demonstrates robustness across distribution shift, partial observability, and multi-modality

Why it matters

It enables scalable, practical human-in-the-loop robot deployment by minimizing operator fatigue while maintaining high autonomy and success rates.

Abstract

Human-in-the-loop robot deployment has gained significant attention in both academia and industry as a semi-autonomous paradigm that enables human operators to intervene and adjust robot behaviors at deployment time, improving success rates. However, continuous human monitoring and intervention can be highly labor-intensive and impractical when deploying a large number of robots. To address this limitation, we propose a method that allows diffusion policies to actively seek human assistance only when necessary, reducing reliance on constant human oversight. To achieve this, we leverage the generative process of diffusion policies to compute an uncertainty-based metric based on which the autonomous agent can decide to request operator assistance at deployment time, without requiring any operator interaction during training. Additionally, we show that the same method can be used for efficient data collection for fine-tuning diffusion policies in order to improve their autonomous performance. Experimental results from simulated and real-world environments demonstrate that our approach enhances policy performance during deployment for a variety of scenarios.

Index terms

Human Factors and Human-in-the-Loop; Imitation Learning; Deep Learning in Grasping and Manipulation