SPRINT: Scalable Policy Pre-Training Via Language Instruction Relabeling
Jesse Zhang, Karl Pertsch, Jiahui Zhang, Joseph Lim
Abstract
Pre-training robots with a rich set of skills can substantially accelerate the learning of downstream tasks. Prior works have defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions. Thus, we propose SPRINT, a scalable offline policy pre-training approach which substantially reduces the human effort needed for pre-training a diverse set of skills. Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining with offline reinforcement learning. As a result, SPRINT pre-training equips robots with a richer repertoire of skills that can help an agent generalize to new tasks. Experiments in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks than previous pre-training approaches. Website at https://clvrai.com/sprint.