Mastering Scene Rearrangement with Expert-Assisted Curriculum Learning and Adaptive Trade-Off Tree-Search
Zan Wang, Hanqing Wang, Wei Liang
Abstract
Scene Rearrangement Planning (SRP) has recently emerged as a crucial interior scene task; however, current approaches still face two primary issues. First, prior works define the action space of SRP using handcrafted coarse-grained actions, which are inflexible for transitioning between scene arrangements and impractical for real-world deployment. Second, the scarcity of realistic indoor scene rearrangement data hinders both popular data-hungry learning approaches and quantitative evaluation. To tackle these issues, we propose a fine-grained action space definition and curate a large-scale scene rearrangement dataset to facilitate the training of learning approaches and comprehensive benchmarking. Building upon this dataset, we introduce a novel framework, PLATO, designed for efficient agent training and inference. Our approach features an exPert-assisted curriculum Learning (PL) paradigm that combines a Behavior Cloning (BC) curriculum and an offline Reinforcement Learning (RL) curriculum for agent training, along with an advanced tree-search-based planner enhanced by an Adaptive Trade-Off (ATO) strategy to further improve the expert agent's performance. Through extensive experiments, we demonstrate the superior performance of our method over baseline agents and provide a detailed analysis to elucidate its rationale. Our project website can be accessed at pl-ato.github.io.