ICRA 2026

Demonstration-Augmented Deep Reinforcement Learning with Mixed Reality Human-In-The-Loop Guidance

Mohammad-Ehsan Matour, Alexander Winkler

AI summary

Real-time mixed reality demonstrations significantly accelerate deep reinforcement learning convergence, improve stability, and bridge the simulation-to-reality gap for robotic tasks.

Mixed reality, Deep reinforcement learning, Human-in-the-loop, Demonstration augmentation, Sim-to-real transfer, Robotic manipulation

Problem

Direct deep reinforcement learning on physical robots suffers from high sample requirements, instability, and the simulation-to-reality gap, while existing demonstration-augmented methods rely on static datasets or misaligned virtual reality setups.

Approach

The framework injects real-time expert annotations captured via mixed reality into a modified replay buffer, using threshold-based activation and a decaying expert-to-agent sampling ratio to guide policy updates.
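
The mechanism described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `MixedReplayBuffer` and `needs_expert`, the multiplicative decay schedule, the `min_ratio` floor, and the use of mean recent return as the performance signal are all hypothetical, since the summary does not specify them.

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Hypothetical buffer keeping agent and expert transitions in separate pools."""

    def __init__(self, capacity=10_000, expert_ratio=0.5, decay=0.99, min_ratio=0.0):
        self.agent = deque(maxlen=capacity)
        self.expert = deque(maxlen=capacity)
        self.expert_ratio = expert_ratio  # target fraction of expert samples per batch
        self.decay = decay                # assumed multiplicative decay factor
        self.min_ratio = min_ratio        # assumed lower bound on the ratio

    def add(self, transition, from_expert=False):
        # Route the transition to the pool matching its origin.
        (self.expert if from_expert else self.agent).append(transition)

    def sample(self, batch_size, rng=random):
        # Draw up to expert_ratio * batch_size expert transitions,
        # then fill the remainder from the agent pool.
        n_expert = min(round(batch_size * self.expert_ratio), len(self.expert))
        n_agent = min(batch_size - n_expert, len(self.agent))
        batch = rng.sample(list(self.expert), n_expert)
        batch += rng.sample(list(self.agent), n_agent)
        rng.shuffle(batch)
        return batch

    def decay_ratio(self):
        # Shift sampling toward agent experience as training progresses.
        self.expert_ratio = max(self.min_ratio, self.expert_ratio * self.decay)


def needs_expert(recent_returns, threshold):
    """Assumed activation rule: request a demonstration when the mean
    recent return falls below the threshold."""
    if not recent_returns:
        return False
    return sum(recent_returns) / len(recent_returns) < threshold
```

In a training loop, one would check `needs_expert(...)` after each evaluation window, add any resulting demonstration with `from_expert=True`, and call `decay_ratio()` once per update or episode. Keeping the two pools separate is one simple way to control the expert-to-agent ratio exactly; a single buffer with per-transition flags and weighted sampling would be an alternative.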

Key results

Accelerated policy convergence through real-time MR guidance
Enhanced stability under noisy expert annotations
Strong generalization to unseen task configurations
Reduced simulation-to-reality gap on physical hardware

Why it matters

Enables data-efficient and robust robot training in real-world settings by making human expertise directly actionable during reinforcement learning.

Abstract

The integration of human expertise into reinforcement learning has gained increasing attention as a means to improve sample efficiency and stability. Current approaches often depend on pre-collected expert demonstrations or virtual reality setups, which are costly to generate and difficult to adapt to dynamic training conditions. In this work, a framework is introduced that augments deep reinforcement learning with real-time demonstrations provided through mixed reality interaction. A structured robotic pick-and-place task serves as the benchmark, where a robot must execute sequential phases of grasping, transporting, and releasing an object. Expert guidance is delivered via mixed reality annotations, which are converted into reference trajectories and injected into the learning process whenever performance falls below a predefined threshold. A modified replay buffer accommodates both agent-generated and expert-generated transitions, allowing controlled sampling with a dynamically adjusted expert-to-agent ratio. Training in the real workspace through mixed reality reduces the simulation-to-reality gap considerably, as confirmed by experiments on a physical robot platform. Experimental evaluation demonstrates that the proposed framework accelerates policy convergence, ensures stability under noisy feedback, and achieves strong generalization to unseen task configurations. These findings highlight the potential of demonstration-augmented reinforcement learning through mixed reality as a data-efficient and robust approach to robot training in real-world scenarios.

Index terms

Agent-Based Systems, AI-Based Methods, Human Factors and Human-in-the-Loop