Playbook: Scalable Discrete Skill Discovery from Unstructured Datasets for Long-Horizon Decision-Making Problems
Minjae Kang, Mineui Hong, Songhwai Oh
AI summary
Problem
Current skill discovery methods do not scale well: they suffer from catastrophic forgetting when learning new skills, particularly when the original datasets are limited or no longer available. This restricts their use in general everyday tasks.
Approach
The authors propose Playbook, a method that represents skills as independent discrete plays and primitives, allowing structural expansion via class-incremental learning to continuously incorporate new tasks while preserving old ones.
Key results
- Achieves 21.4% success rate on challenging CALVIN robotic manipulation tasks
- Reaches 77.0% average success rate on combined old and new tasks after dataset extension
- Solves compounded problems with 24.4% success rate by mixing plays from different datasets
- Introduces a scalable architecture using discrete plays and primitives with class-incremental learning to prevent catastrophic forgetting
Why it matters
Enables AI agents and robots to continuously adapt to new, complex tasks over time without losing previously acquired skills, advancing practical long-horizon decision-making.
Abstract
Skill discovery methods enable agents to tackle intricate tasks by acquiring diverse and useful skills from task-agnostic datasets in an unsupervised manner. To apply these methods to more general and everyday tasks, the skill set must be scalable. However, current approaches struggle with this scalability, often facing the challenge of catastrophic forgetting when learning new skills. To address this limitation, we propose a scalable skill discovery algorithm, a playbook, which can accommodate unseen tasks by acquiring new skills while maintaining previously learned ones. The scalable structure of the playbook, consisting of finite and independent plays and primitives, enables expansion by adding new elements to accommodate new tasks. The proposed method is evaluated on complex robotic manipulation benchmarks, and the results show that the playbook outperforms existing state-of-the-art methods. We release code for the playbook and pretrained checkpoints at https://github.com/rllab-snu/Playbook.
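The idea of expanding a skill set by adding new elements while leaving old ones intact can be illustrated with a small toy sketch. This is not the authors' implementation: the class name `Playbook`, the methods `expand` and `decode`, and the choice to model a play as a weight vector over primitive vectors are all assumptions made here for illustration only. The sketch shows just one property the abstract describes, namely that adding new plays and primitives need not change what the existing plays produce.

```python
from dataclasses import dataclass, field
from typing import List

Vector = List[float]


@dataclass
class Playbook:
    """Toy container (hypothetical, not the paper's model).

    primitives: fixed vectors standing in for low-level behaviors.
    plays: one weight per primitive, standing in for a discrete skill.
    """
    primitives: List[Vector] = field(default_factory=list)
    plays: List[Vector] = field(default_factory=list)

    def expand(self, new_primitives: List[Vector], new_plays: List[Vector]) -> None:
        """Append new primitives and plays without altering old behavior."""
        n_new = len(new_primitives)
        n_total = len(self.primitives) + n_new
        for play in new_plays:
            if len(play) != n_total:
                raise ValueError("each new play needs one weight per primitive")
        self.primitives.extend(list(p) for p in new_primitives)
        # Old plays get zero weight on the new primitives, so their
        # decoded output is exactly what it was before the expansion.
        for play in self.plays:
            play.extend([0.0] * n_new)
        self.plays.extend(list(p) for p in new_plays)

    def decode(self, play_index: int) -> Vector:
        """Return the weighted sum of primitives for the chosen play."""
        weights = self.plays[play_index]
        dim = len(self.primitives[0])
        return [
            sum(w * prim[d] for w, prim in zip(weights, self.primitives))
            for d in range(dim)
        ]


if __name__ == "__main__":
    book = Playbook(
        primitives=[[1.0, 0.0], [0.0, 1.0]],
        plays=[[1.0, 0.0], [0.5, 0.5]],
    )
    before = [book.decode(i) for i in range(2)]
    book.expand(new_primitives=[[1.0, 1.0]], new_plays=[[0.0, 0.0, 2.0]])
    after = [book.decode(i) for i in range(2)]
    print(before == after)  # old plays are unaffected by the expansion
```

In this toy setting, forgetting is avoided by construction because old entries are never overwritten. The actual method relies on class-incremental learning, which this sketch does not attempt to reproduce.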