Bi-CL: A Reinforcement Learning Framework for Robots Coordination through Bi-Level Optimization
Zechen Hu, Daigo Shishika, Xuesu Xiao, Xuan Wang
Abstract
In multi-robot systems, accomplishing coordinated missions remains a significant challenge due to the coupled nature of coordination behaviors and the lack of global information for individual robots. To mitigate these challenges, this paper introduces a novel approach, Bi-level Coordination Learning (Bi-CL), that leverages a bi-level optimization structure within a centralized training and decentralized execution (CTDE) paradigm. Our bi-level reformulation decomposes the original problem into a reinforcement learning level with a reduced action space and an imitation learning level that obtains demonstrations from a global optimizer. Bi-CL further integrates an alignment penalty mechanism, which aims to minimize the discrepancy between the two levels without degrading their training efficiency. We introduce a running example to conceptualize the problem formulation. Simulation results demonstrate that Bi-CL learns more efficiently than traditional multi-agent reinforcement learning baselines for multi-robot coordination while achieving comparable performance.