Beyond the Teacher: Leveraging Mixed-Skill Demonstrations for Robust Imitation Learning
Saharsh Saharsh, Shubham Sonkar, Pushpak Jagtap, Ravi Prakash
AI summary
Problem
Standard imitation learning methods require large volumes of high-quality expert demonstrations, creating a scalability bottleneck and causing policy degradation when trained on scarce, noisy, or suboptimal real-world data.
Approach
The method first identifies the best raw demonstration using a multi-objective information-aware score, extracts a canonical motion template via periodic Dynamic Movement Primitives, and then trains an LSTM to correct all demonstrations by balancing mutual information maximization with task-specific constraints.
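As a rough illustration of the selection step, the sketch below ranks demonstrations by a weighted sum of min-max normalized metrics and returns the top one. The metric names (`smoothness`, `information`), the weights, and the helper `select_best_demo` are hypothetical; this summary does not specify the actual information-aware score.

```python
def normalize(values):
    """Min-max scale a list of metric values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All demonstrations tie on this metric; give each a neutral value.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_best_demo(metrics, weights):
    """Return (index of the best demonstration, list of all scores).

    metrics: dict mapping a metric name to one value per demonstration,
             where higher is assumed to be better (hypothetical convention).
    weights: dict mapping the same metric names to user-chosen weights.
    """
    n_demos = len(next(iter(metrics.values())))
    scores = [0.0] * n_demos
    for name, values in metrics.items():
        for i, v in enumerate(normalize(values)):
            scores[i] += weights[name] * v
    best = max(range(n_demos), key=scores.__getitem__)
    return best, scores


# Hypothetical metric values for three demonstrations.
metrics = {"smoothness": [0.2, 0.9, 0.5], "information": [1.0, 3.0, 2.0]}
weights = {"smoothness": 0.5, "information": 0.5}
best, scores = select_best_demo(metrics, weights)
```

Normalizing each metric before weighting keeps one metric with a large numeric range from dominating the score; other aggregation schemes would work equally well here.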
Key results
- Recovers expert-like periodic motions from the single best-scoring demonstration together with the remaining noisy demonstrations
- Achieves robust policy learning across wiping, weaving, and pick-and-place tasks
- Introduces an adaptive demonstrator ability factor that dynamically balances information and task losses
- Outperforms standard imitation learning baselines on imperfect, mixed-skill datasets
Why it matters
Enables scalable, data-efficient robotic skill acquisition for real-world applications where collecting high-quality expert demonstrations is impractical or costly.
Abstract
Achieving expert-like robotic task execution in dynamic environments typically requires extensive, high-quality expert demonstrations, a significant bottleneck for real-world deployment. We present a novel learning framework that overcomes this data dependency, enabling robots to perform complex periodic tasks with expert-like proficiency, even when learning from naive demonstrations. Our two-stage pipeline first selects a representative demonstration based on user-defined information-aware task intention scores. This single best demo is then used to extract a canonical motion shape via Periodic Dynamic Movement Primitives (DMPs). Finally, a Long Short-Term Memory (LSTM) network refines the entire set of demonstrations, leveraging a multi-objective score that combines the canonical shape with mutual information and other task quality metrics. The proposed approach is demonstrated on a Franka Research 3 robot performing phasic tasks across three contrasting domains: wiping in human assistive services, weaving in the textile industry, and pick-and-place operations for warehouse automation. Project page: https://focaslab.github.io/beyondtheteacher/.
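To make the canonical-shape step more concrete, here is a minimal sketch of fitting a one-dimensional periodic DMP to a single demonstration. It assumes the commonly used rhythmic DMP formulation (von Mises-shaped basis functions over a phase variable, weights fitted by locally weighted regression with unit amplitude), a known period, and forward Euler integration. The gains, basis count, and function names are illustrative choices, not taken from the paper.

```python
import math


def finite_difference(x, dt):
    """Central differences inside the signal, one-sided at the two ends."""
    n = len(x)
    out = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        out.append((x[hi] - x[lo]) / ((hi - lo) * dt))
    return out


def basis(phi, centers, h):
    """Periodic basis functions exp(h * (cos(phi - c) - 1)) at phase phi."""
    return [math.exp(h * (math.cos(phi - c) - 1.0)) for c in centers]


def fit_periodic_dmp(y, dt, period, n_basis=50, alpha=25.0):
    """Fit forcing-term weights so the DMP reproduces the demonstration y."""
    beta = alpha / 4.0                      # critically damped spring
    omega = 2.0 * math.pi / period          # phase rate
    g = sum(y) / len(y)                     # centre of the oscillation
    yd = finite_difference(y, dt)
    ydd = finite_difference(yd, dt)
    centers = [2.0 * math.pi * i / n_basis for i in range(n_basis)]
    h = 2.5 * n_basis                       # basis width (heuristic)
    num = [0.0] * n_basis
    den = [0.0] * n_basis
    for k in range(len(y)):
        # Forcing term that would make the DMP follow the demo exactly.
        f_target = ydd[k] / omega**2 - alpha * (beta * (g - y[k]) - yd[k] / omega)
        psi = basis(omega * k * dt, centers, h)
        for i in range(n_basis):
            num[i] += psi[i] * f_target
            den[i] += psi[i]
    weights = [a / b for a, b in zip(num, den)]
    return {"weights": weights, "centers": centers, "h": h, "g": g,
            "omega": omega, "alpha": alpha, "beta": beta,
            "y0": y[0], "z0": yd[0] / omega}


def rollout(dmp, dt, n_steps):
    """Integrate the fitted DMP with forward Euler and return positions."""
    y, z, phi = dmp["y0"], dmp["z0"], 0.0
    out = []
    for _ in range(n_steps):
        out.append(y)
        psi = basis(phi, dmp["centers"], dmp["h"])
        f = sum(p * w for p, w in zip(psi, dmp["weights"])) / sum(psi)
        dz = dmp["omega"] * (dmp["alpha"] * (dmp["beta"] * (dmp["g"] - y) - z) + f)
        dy = dmp["omega"] * z
        y, z, phi = y + dy * dt, z + dz * dt, phi + dmp["omega"] * dt
    return out


# Usage: fit a synthetic sine "demonstration" (two periods) and replay it.
period, dt = 2.0, 0.002
demo = [math.sin(2.0 * math.pi * k * dt / period) for k in range(2000)]
dmp = fit_periodic_dmp(demo, dt, period)
replay = rollout(dmp, dt, len(demo))
```

Because the learned shape lives in the weights over phase, the same weights can be replayed at a different `omega` or around a different centre `g`, which is one reason a periodic DMP is a convenient container for a canonical motion template.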