SwarmNav: Swarm Robotics Navigation in Dynamic and Dense Environments Via Reinforcement Learning
Shengbo Li, Chuanjie Lv, Xiangqian Yuan, Liming Xu, Xinyang Liu, Zongzhi Zhu, Gang Xu, Yong Liu
AI summary
Problem
Traditional swarm navigation methods struggle with collision avoidance, computational bottlenecks, poor generalization across varying swarm sizes, and deadlock in symmetric scenarios when operating in dynamic, dense environments.
Approach
SwarmNav uses a deep reinforcement learning actor-critic framework that processes LiDAR observations and integrates reciprocal velocity obstacles (RVO) with a goal-region amplification reward strategy to proactively guide robots toward goals while avoiding moving obstacles.
Key results
- Hierarchical training strategy progressively builds collision avoidance skills
- Novel reward function combining RVO and goal-region amplification improves safety
- Simulations demonstrate superior success rate and computational efficiency over state-of-the-art methods
- Real-world experiments confirm robust navigation across diverse dynamic scenarios
Why it matters
Provides a scalable, real-world deployable navigation solution for large-scale swarm robotics applications operating in unpredictable, obstacle-rich environments.
Abstract
Collision avoidance and navigation in dynamic and dense environments remain highly challenging for swarm robotics. To address this, we propose SwarmNav, a novel goal-region amplification navigation policy that leverages LiDAR-based position data to generate velocity commands guiding robots toward their goals while actively avoiding obstacles. SwarmNav is trained within a deep reinforcement learning actor-critic framework. In this framework, the reward function integrates a goal-region amplification term with the reciprocal velocity obstacles formulation, enabling goal-directed navigation under dynamic obstacle uncertainty. Extensive simulations demonstrate that SwarmNav significantly outperforms state-of-the-art approaches, including both reinforcement learning-based and traditional velocity obstacle-based methods, in terms of success rate and computational efficiency. Real-world experiments across diverse scenarios further confirm its effectiveness in dynamic and dense environments.
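The abstract does not give the reward formula, so the following is only an illustrative sketch of how a goal-progress term amplified near the goal might be combined with a reciprocal velocity obstacle (RVO) penalty. The function names, the step-shaped amplification, and every parameter (`goal_radius`, `gain`, `horizon`, `penalty`) are assumptions made for illustration, not the authors' definitions. The RVO test uses the standard construction: a candidate velocity is checked by evaluating the relative velocity `2*v_cand - v_a - v_b` against the collision disc of the two robots.

```python
import math


def rvo_time_to_collision(p_a, v_a, p_b, v_b, v_cand, radius_sum):
    """Time until collision if robot A adopts v_cand, under the RVO construction.

    Returns 0.0 if the robots already overlap and math.inf if the candidate
    velocity lies outside the reciprocal velocity obstacle.
    """
    # Position of B relative to A.
    px, py = p_b[0] - p_a[0], p_b[1] - p_a[1]
    # RVO: test 2*v_cand - v_a as A's velocity, relative to B's velocity.
    ux = 2.0 * v_cand[0] - v_a[0] - v_b[0]
    uy = 2.0 * v_cand[1] - v_a[1] - v_b[1]

    c = px * px + py * py - radius_sum ** 2
    if c <= 0.0:
        return 0.0  # already in contact
    a = ux * ux + uy * uy
    if a == 0.0:
        return math.inf  # no relative motion
    # Solve |p - u t|^2 = radius_sum^2 for the smallest root t.
    b = -2.0 * (px * ux + py * uy)
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return math.inf  # relative velocity misses the collision disc
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0.0 else math.inf


def step_reward(d_prev, d_curr, ttc, goal_radius=1.0, gain=2.0,
                horizon=3.0, penalty=1.0):
    """Hypothetical per-step reward: amplified goal progress plus RVO penalty."""
    progress = d_prev - d_curr
    # Assumed amplification: scale progress by `gain` inside the goal region.
    amplification = gain if d_curr <= goal_radius else 1.0
    r_goal = amplification * progress
    # Assumed penalty: grows linearly as predicted collision time shrinks.
    r_rvo = -penalty * (1.0 - ttc / horizon) if ttc < horizon else 0.0
    return r_goal + r_rvo


# Two robots approaching head-on, 4 m apart, combined radius 1 m.
ttc = rvo_time_to_collision((0, 0), (1, 0), (4, 0), (-1, 0), (1, 0), 1.0)
reward = step_reward(d_prev=2.0, d_curr=1.5, ttc=ttc)
```

In this sketch, a robot that makes progress toward its goal while keeping its current head-on velocity has the progress reward offset by the RVO penalty, whereas the same progress inside the assumed goal region is doubled.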