ICRA 2026

Beyond the Teacher: Leveraging Mixed-Skill Demonstrations for Robust Imitation Learning

Saharsh Saharsh, Shubham Sonkar, Pushpak Jagtap, Ravi Prakash

AI summary

A two-stage pipeline combining periodic DMPs and LSTM refinement successfully recovers expert-like robotic behaviors from a single clean demonstration and noisy mixed-skill data, drastically reducing imitation learning's data dependency.

Imitation Learning, Mixed-Skill Demonstrations, Periodic DMPs, LSTM Refinement, Data-Efficient Robotics, Robotic Policy Learning

Problem

Standard imitation learning methods require large volumes of high-quality expert demonstrations, creating a scalability bottleneck and causing policy degradation when trained on scarce, noisy, or suboptimal real-world data.

Approach

The method first identifies the best raw demonstration using a multi-objective information-aware score, extracts a canonical motion template via periodic Dynamic Movement Primitives, and then trains an LSTM to correct all demonstrations by balancing mutual information maximization with task-specific constraints.
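The template-extraction step can be illustrated with a minimal periodic DMP fit. This is a sketch under stated assumptions, not the authors' implementation: the gains, the number and width of the basis functions, the locally weighted regression fit, and the synthetic sine "demonstration" are all illustrative choices that do not come from the paper.

```python
import math

# All constants below are assumptions chosen for illustration.
N_BASIS = 25            # number of periodic basis functions
H = 2.5 * N_BASIS       # basis width
ALPHA, BETA = 8.0, 2.0  # spring-damper gains
CENTERS = [2.0 * math.pi * i / N_BASIS for i in range(N_BASIS)]


def basis(phi):
    """Von Mises-style periodic kernels evaluated at phase phi."""
    return [math.exp(H * (math.cos(phi - c) - 1.0)) for c in CENTERS]


def fit_weights(phis, y, yd, ydd, omega, goal):
    """Fit forcing-term weights to one demonstration by locally weighted regression."""
    num = [0.0] * N_BASIS
    den = [0.0] * N_BASIS
    for phi, p, v, a in zip(phis, y, yd, ydd):
        # Forcing term that makes the spring-damper system reproduce the demo.
        f_target = a / omega**2 - ALPHA * (BETA * (goal - p) - v / omega)
        for i, psi in enumerate(basis(phi)):
            num[i] += psi * f_target
            den[i] += psi
    return [n / d for n, d in zip(num, den)]


def forcing(phi, weights):
    """Normalized weighted sum of the periodic kernels."""
    psi = basis(phi)
    return sum(p * w for p, w in zip(psi, weights)) / sum(psi)


def rollout(weights, omega, goal, y0, z0, dt, steps):
    """Integrate the periodic DMP with explicit Euler steps."""
    y, z, phi, out = y0, z0, 0.0, []
    for _ in range(steps):
        out.append(y)
        zd = omega * (ALPHA * (BETA * (goal - y) - z) + forcing(phi, weights))
        yd = omega * z
        y, z, phi = y + yd * dt, z + zd * dt, phi + omega * dt
    return out


# Synthetic one-period demonstration (a 1 Hz sine), standing in for the selected demo.
omega = 2.0 * math.pi
phis = [omega * k / 400 for k in range(400)]
demo = [math.sin(p) for p in phis]
demo_d = [omega * math.cos(p) for p in phis]
demo_dd = [-omega**2 * math.sin(p) for p in phis]

weights = fit_weights(phis, demo, demo_d, demo_dd, omega, goal=0.0)
traj = rollout(weights, omega, goal=0.0, y0=demo[0], z0=demo_d[0] / omega,
               dt=1e-3, steps=1000)
max_err = max(abs(a - math.sin(omega * k * 1e-3)) for k, a in enumerate(traj))
```

Here `weights` plays the role of the canonical motion shape: once fitted, the same weights can be replayed at a different frequency `omega` or around a different `goal` without refitting, which is what makes the representation usable as a template.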

Key results

Recovers expert-like periodic motions from a single clean demonstration and noisy data
Achieves robust policy learning across wiping, weaving, and pick-and-place tasks
Introduces an adaptive demonstrator ability factor that dynamically balances information and task losses
Outperforms standard imitation learning baselines on imperfect, mixed-skill datasets

Why it matters

Enables scalable, data-efficient robotic skill acquisition for real-world applications where collecting high-quality expert demonstrations is impractical or costly.

Abstract

Achieving expert-like robotic task execution in dynamic environments typically requires extensive, high-quality expert demonstrations, a significant bottleneck for real-world deployment. We present a novel learning framework that overcomes this data dependency, enabling robots to perform complex periodic tasks with expert-like proficiency, even when learning from naive demonstrations. Our two-stage pipeline first selects a representative demonstration based on user-defined information-aware task intention scores. This single best demo is then used to extract a canonical motion shape via Periodic Dynamic Movement Primitives (DMPs). Finally, a Long Short-Term Memory (LSTM) network refines the entire set of demonstrations, leveraging a multi-objective score that combines the canonical shape with mutual information and other task quality metrics. The proposed approach is demonstrated on a Franka Research 3 robot performing phasic tasks across three contrasting domains: wiping in human assistive services, weaving in the textile industry, and pick-and-place operations for warehouse automation. Project page: https://focaslab.github.io/beyondtheteacher/.
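The selection step in the first stage might look roughly like the following. This is only a sketch: the abstract does not specify the scoring terms, so the metric names (`information`, `smoothness`), the weighted-sum form, and all numeric values are hypothetical placeholders.

```python
def select_representative(scores_per_demo, weights):
    """Return the index of the demonstration with the highest weighted score."""
    def total(scores):
        return sum(weights[name] * value for name, value in scores.items())
    return max(range(len(scores_per_demo)), key=lambda i: total(scores_per_demo[i]))


# Hypothetical per-demonstration scores, each already normalized to [0, 1].
demos = [
    {"information": 0.40, "smoothness": 0.90},
    {"information": 0.85, "smoothness": 0.70},
    {"information": 0.60, "smoothness": 0.30},
]
# Hypothetical user-defined weights expressing the task intention.
user_weights = {"information": 0.6, "smoothness": 0.4}

best = select_representative(demos, user_weights)
```

With these placeholder numbers the second demonstration (index 1) scores highest; changing `user_weights` shifts which demonstration is treated as representative.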

Index terms

Imitation Learning; Probabilistic Inference; Data Sets for Robot Learning