Learning Load-Balanced Distributed Coverage for Robot Swarms Via Graph Attention Networks
∗, Wenzong Ma, Hui Xiong, and Yiding Ji
AI summary
Problem
Existing distributed coverage control methods for robot swarms struggle with workload imbalance, limited communication ranges, and reliance on global target information, making them unscalable and impractical for real-world deployment.
Approach
The authors combine a model-based centroidal Voronoi tessellation controller with a graph attention network in an actor-critic architecture, using locally computed CVT variables as node features to learn decentralized coordination policies through multi-hop neighbor interactions.
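As a rough illustration of the neighbor-attention step, the sketch below implements one single-head layer in the standard graph attention form with NumPy. It is not the authors' network: the function name `gat_layer`, the parameters `W` and `a`, and the feature dimensions are assumptions made for this example, and the actor-critic heads are omitted.

```python
import numpy as np

def gat_layer(h, adj, W, a, slope=0.2):
    """One single-head graph attention layer (illustrative sketch).

    h:   (n, f) node features, e.g. locally computed control variables
    adj: (n, n) adjacency of the communication graph (True = link)
    W:   (f, k) shared linear map; a: (2k,) attention vector
    Returns the updated features (n, k) and the attention weights (n, n).
    """
    adj = np.asarray(adj).astype(bool)
    n = h.shape[0]
    z = h @ W                                   # projected features, (n, k)
    k = z.shape[1]
    # Raw score e[i, j] = LeakyReLU(a[:k] . z_i + a[k:] . z_j)
    e = (z @ a[:k])[:, None] + (z @ a[k:])[None, :]
    e = np.where(e > 0, e, slope * e)
    # A robot attends only to itself and to robots within communication range.
    mask = adj | np.eye(n, dtype=bool)
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha
```

A single layer mixes information from one-hop neighbors only. Stacking several such layers is what lets information propagate across multiple hops without any global broadcast.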
Key results
- Distributed CVT controller with explicit load density feedback for balanced resource consumption.
- GAT-based decentralized policy learning that adapts to dynamic target distributions without global broadcast.
- Actor-critic training scheme jointly optimizing coverage efficiency and network connectivity.
- Superior simulation performance over Lloyd and GNN baselines, with successful sim-to-real transfer validated in real-world experiments.
Why it matters
Enables scalable, resource-aware swarm coordination for real-world applications like disaster response and environmental monitoring where communication and battery life are constrained.
Abstract
Coverage control for dynamic targets remains challenging in multi-robot systems due to limited communication, workload imbalance, and the lack of scalable decentralized strategies. In this paper, we propose a hybrid model-based and learning-driven framework that enables distributed coverage with load balancing under local communication constraints. We first derive a centroidal Voronoi tessellation (CVT)-based controller that explicitly incorporates load density regulation to balance resource consumption among robots. To eliminate the reliance on global target information, we embed key control variables as node features and employ a graph attention network (GAT) to learn decentralized coordination policies through multi-hop neighbor interactions. An actor-critic training scheme further refines the policy to maximize coverage performance while preserving network connectivity. Simulations demonstrate superior coverage efficiency and connectivity maintenance compared with Lloyd and GNN-based baselines, together with strong generalization across varying sensing ranges and swarm sizes. Real-world experiments validate the sim-to-real transferability of the proposed framework.
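For readers unfamiliar with the Lloyd baseline named above, the following is a minimal sketch of one classical density-weighted Lloyd step on a sampled workspace. It is not the load-balanced controller proposed in the paper: the function name `lloyd_step`, the `gain` parameter, and the discretization of the workspace into sample points are assumptions made for illustration. The returned per-cell mass is only a rough proxy for each robot's workload.

```python
import numpy as np

def lloyd_step(robots, samples, density, gain=1.0):
    """One density-weighted Lloyd update (illustrative sketch).

    robots:  (n, 2) robot positions
    samples: (m, 2) points discretizing the workspace
    density: (m,) nonnegative importance weight at each sample
    Returns the new positions (n, 2) and the weighted mass of each cell (n,).
    """
    robots = np.asarray(robots, dtype=float)
    samples = np.asarray(samples, dtype=float)
    density = np.asarray(density, dtype=float)

    # Assign every sample to its nearest robot: a discrete Voronoi partition.
    d2 = ((samples[:, None, :] - robots[None, :, :]) ** 2).sum(axis=2)
    owner = d2.argmin(axis=1)

    new_robots = robots.copy()
    mass = np.zeros(len(robots))
    for i in range(len(robots)):
        in_cell = owner == i
        w = density[in_cell]
        mass[i] = w.sum()
        if mass[i] > 0:
            centroid = (samples[in_cell] * w[:, None]).sum(axis=0) / mass[i]
            # Proportional law: move the robot toward its cell centroid.
            new_robots[i] = robots[i] + gain * (centroid - robots[i])
    return new_robots, mass
```

Iterating this step drives the robots toward a centroidal Voronoi configuration. Large differences between entries of the returned mass vector indicate the kind of workload imbalance that the proposed controller is designed to reduce.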