Multi-Level Progressive Reinforcement Learning for Control Policy in Physical Simulations
Kefei Wu, Xuming He, Yang Wang, Xiaopei Liu
Abstract
Training model-free intelligent agents in complex real-world scenarios using reinforcement learning (RL) often requires simulation-based environments, because physical trials are expensive. However, when each simulation takes a long time, e.g., an unsteady 3D fluid simulation interacting with controllable solids, existing RL algorithms struggle to complete training within a reasonable time frame. In this paper, we propose a novel multi-level framework for RL that accelerates convergence, as a first attempt to address this difficulty. Motivated by the idea of multi-grid solvers, the control policy of a virtual agent over time can be decomposed into different frequency levels, which can be learned progressively via a set of simulations in a coarse-to-fine manner. Most RL trials are expected to be performed in coarser, cheaper simulations to learn the lower control frequency levels with more efficient convergence, while the higher frequency levels require far fewer RL trials, thus significantly accelerating the learning process. To implement this idea, we design a novel multi-level residual network with an attached filter module, where each level of the network is learned by performing RL at a given simulation resolution. The proposed framework is evaluated through policy learning experiments on a virtual aerial robot (2D) and an underwater robot (3D), both of which require time-consuming physical simulations. Our results show that learning time is reduced by almost half compared with a direct RL approach, while similar control performance is achieved.
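To make the frequency-level decomposition concrete, the following is a minimal sketch of how a control signal over time can be split into a coarse level plus finer residual levels that sum back to the original. It is an illustration only, not the multi-level residual network or the filter module of the paper: the moving-average low-pass filter, the function names `moving_average` and `decompose`, and the filter widths are all assumptions chosen for simplicity.

```python
import numpy as np


def moving_average(signal, width):
    """Low-pass filter (assumed for illustration): centered moving average."""
    if width <= 1:
        return signal.copy()
    pad_left = width // 2
    pad_right = width - 1 - pad_left
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(signal, (pad_left, pad_right), mode="edge")
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


def decompose(signal, widths):
    """Split a signal into coarse-to-fine levels.

    `widths` must decrease and end with 1, so that the last
    approximation equals the signal and the levels sum to it.
    """
    signal = np.asarray(signal, dtype=float)
    levels = []
    approx = np.zeros_like(signal)
    for width in widths:
        smooth = moving_average(signal, width)
        levels.append(smooth - approx)  # residual added at this level
        approx = smooth
    return levels


# Hypothetical control signal: a slow component plus a fast component.
t = np.linspace(0.0, 1.0, 64)
control = np.sin(2 * np.pi * t) + 0.2 * np.sin(2 * np.pi * 16 * t)

levels = decompose(control, widths=[16, 4, 1])
reconstructed = sum(levels)
```

In this sketch the first level carries the slowly varying part of the control, and each later level adds only a residual correction. This mirrors the coarse-to-fine intuition stated above, where the coarse level would be learned in cheap, low-resolution simulations and only the residuals would need the expensive ones.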