SymSkill: Symbol and Skill Co-Invention for Data-Efficient and Reactive Long-Horizon Manipulation
Yifei Shao, Yuchen Zheng, Sunan Sun, Pratik Chaudhari, Vijay Kumar, Nadia Figueroa
AI summary
Problem
Imitation learning lacks compositional generalization for multi-step tasks, while classical task-and-motion planning suffers from high latency that prevents real-time failure recovery in dynamic environments.
Approach
SymSkill jointly learns relative-pose predicates, symbolic operators, and goal-oriented dynamical system skills from unsegmented demonstrations offline, then uses a symbolic planner online to compose skills and recover from failures in real time.
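To make the two learned ingredients named above concrete, here is a minimal sketch of a relative-pose predicate and a goal-oriented dynamical system skill. It is an illustration under stated assumptions, not the authors' implementation: the names `near_pose`, `ds_step`, and `run_skill`, the translation-only pose, the tolerance, and the linear dynamics x_dot = -gain * (x - goal) are all hypothetical simplifications, and the skills learned by SymSkill may be parameterized differently.

```python
import math

def relative_position(obj_pos, ref_pos):
    """Position of obj relative to ref (translation only, for brevity)."""
    return tuple(o - r for o, r in zip(obj_pos, ref_pos))

def near_pose(obj_pos, ref_pos, target_offset, tol=0.02):
    """Hypothetical relative-pose predicate: true when obj lies within
    tol (metres) of target_offset expressed relative to ref."""
    rel = relative_position(obj_pos, ref_pos)
    return math.dist(rel, target_offset) <= tol

def ds_step(x, goal, gain=2.0, dt=0.01):
    """One Euler step of the assumed linear system x_dot = -gain * (x - goal)."""
    return tuple(xi - dt * gain * (xi - gi) for xi, gi in zip(x, goal))

def run_skill(x, goal, tol=1e-3, max_steps=10_000):
    """Integrate the system until the state is within tol of the goal."""
    for _ in range(max_steps):
        if math.dist(x, goal) <= tol:
            break
        x = ds_step(x, goal)
    return x

# Hypothetical scene: move a mug to 10 cm above a shelf reference point.
shelf = (0.6, 0.2, 0.1)
offset = (0.0, 0.0, 0.1)
goal = tuple(s + o for s, o in zip(shelf, offset))
final = run_skill((0.5, 0.0, 0.3), goal)
print(near_pose(final, shelf, offset))
```

Because the velocity depends only on the current state and the goal, a perturbation mid-motion simply changes the starting point of the remaining trajectory. That property is what makes a skill of this kind reactive at the motion level.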
Key results
- 85% success rate on 12 single-step tasks in RoboCasa simulation
- Composes skills into multi-step plans without additional data
- Learns 11 operators from 5 minutes of real-world play data on a Franka robot
- Enables real-time symbolic and motion-level failure recovery with sub-100ms planning latency
Why it matters
Bridges the gap between imitation learning and classical planning by enabling scalable, real-time robotic manipulation without hand-engineered symbols or massive datasets.
Abstract
Multi-step manipulation in dynamic environments remains challenging. Imitation learning (IL) is reactive but lacks compositional generalization, since monolithic policies do not decide which skill to reuse when scenes change. Classical task-and-motion planning (TAMP) offers compositionality, but its high planning latency prevents real-time failure recovery. We introduce SymSkill, a unified framework that jointly learns predicates, operators, and skills from unlabeled, unsegmented demonstrations, combining compositional generalization with real-time recovery. Offline, SymSkill learns symbolic abstractions and goal-oriented skills directly from demonstrations. Online, given a conjunction of learned predicates, it uses a symbolic planner to compose and reorder skills to achieve symbolic goals while recovering from failures at both the motion and symbolic levels in real time. Coupled with a compliant controller, SymSkill supports safe execution under human and environmental disturbances. In RoboCasa simulation, SymSkill executes 12 single-step tasks with 85% success and composes them into multi-step plans without additional data. On a real Franka robot, it learns from 5 minutes of play data and performs 12-step tasks from goal specifications. Code and additional analysis are available at https://sites.google.com/view/symskill.
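The online loop described in the abstract (plan over learned operators toward a conjunction of predicates, then replan when execution deviates) can be sketched as follows. This is a hedged illustration, not SymSkill's planner: the STRIPS-style `Operator` type, the breadth-first search, the three-operator mug-and-drawer domain, and the simulated failing grasp are all assumptions introduced here for the example.

```python
from collections import deque
from typing import NamedTuple

class Operator(NamedTuple):
    name: str
    pre: frozenset      # predicates that must hold before execution
    add: frozenset      # predicates made true by the skill
    delete: frozenset   # predicates made false by the skill

def plan(state, goal, operators):
    """Breadth-first search for a shortest operator sequence reaching goal."""
    start = frozenset(state)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        s, steps = queue.popleft()
        if goal <= s:
            return steps
        for op in operators:
            if op.pre <= s:
                nxt = (s - op.delete) | op.add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [op]))
    return None  # goal unreachable with these operators

def execute_with_recovery(state, goal, operators, run_skill, max_steps=50):
    """Execute a plan; replan whenever the observed state differs from the
    state the operator predicted (symbolic-level recovery)."""
    state, trace = frozenset(state), []
    steps = plan(state, goal, operators)
    while steps is not None and len(trace) < max_steps:
        if not steps:
            return trace  # goal holds in the current state
        op = steps.pop(0)
        expected = (state - op.delete) | op.add
        state = frozenset(run_skill(op, state))  # observed symbolic state
        trace.append(op.name)
        if state != expected:
            steps = plan(state, goal, operators)
    return None

# Hypothetical domain: put a mug into a drawer.
OPERATORS = [
    Operator("open_drawer", frozenset({"hand_empty"}),
             frozenset({"drawer_open"}), frozenset()),
    Operator("pick_mug", frozenset({"hand_empty", "mug_on_table"}),
             frozenset({"holding_mug"}), frozenset({"hand_empty", "mug_on_table"})),
    Operator("place_mug", frozenset({"holding_mug", "drawer_open"}),
             frozenset({"mug_in_drawer", "hand_empty"}), frozenset({"holding_mug"})),
]
INITIAL = frozenset({"hand_empty", "mug_on_table"})
GOAL = frozenset({"mug_in_drawer"})

def make_flaky_executor():
    """Simulated skill execution in which the first grasp attempt fails."""
    failed = {"done": False}
    def run_skill(op, state):
        if op.name == "pick_mug" and not failed["done"]:
            failed["done"] = True
            return state  # grasp slipped: symbolic state unchanged
        return (state - op.delete) | op.add
    return run_skill

print(execute_with_recovery(INITIAL, GOAL, OPERATORS, make_flaky_executor()))
```

In this toy run the executed trace contains `pick_mug` twice: the first grasp leaves the symbolic state unchanged, the mismatch triggers a replan from the observed state, and the new plan retries the grasp before placing. Exhaustive search is affordable here only because the domain is tiny; a real system would need a planner that scales to its learned operator set.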