ICRA 2026

SwarmNav: Swarm Robotics Navigation in Dynamic and Dense Environments Via Reinforcement Learning

Shengbo Li, Chuanjie Lv, Xiangqian Yuan, Liming Xu, Xinyang Liu, Zongzhi Zhu, Gang Xu, Yong Liu

AI summary

SwarmNav leverages a novel goal-region amplification strategy within a deep reinforcement learning framework to enable safe, efficient, and scalable navigation for robot swarms in dynamic, dense environments.

Swarm robotics, Reinforcement learning, Collision avoidance, Dynamic environments, Goal-region amplification, RVO

Problem

Traditional swarm navigation methods struggle with collision avoidance, computational bottlenecks, poor generalization across varying swarm sizes, and deadlock in symmetric scenarios when operating in dynamic, dense environments.

Approach

SwarmNav uses a deep reinforcement learning actor-critic framework that processes LiDAR observations and integrates reciprocal velocity obstacles (RVO) with a goal-region amplification reward strategy to proactively guide robots toward goals while avoiding moving obstacles.
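
As an illustration of the reciprocal velocity obstacle (RVO) idea mentioned above, the following is a minimal sketch that tests whether a candidate velocity for robot A falls inside the RVO induced by robot B. It assumes disc-shaped robots in 2D and an unbounded time horizon, and it uses the standard RVO definition (a candidate v' is in the RVO when 2v' - v_A lies in the velocity obstacle of B). The function name `in_rvo` and its parameters are hypothetical; this is not the paper's reward function, and the goal-region amplification term is not shown because this page does not define it.

```python
import math


def in_rvo(p_a, v_a, r_a, p_b, v_b, r_b, v_candidate):
    """Return True if v_candidate lies inside the RVO that B induces on A.

    Assumptions (illustrative only): 2D disc robots, infinite time horizon.
    Positions and velocities are (x, y) tuples; r_a and r_b are radii.
    """
    # RVO definition: v' is in RVO iff (2*v' - v_a) is in the velocity
    # obstacle of B, i.e. the relative velocity below points into the
    # collision cone spanned by the disc of radius r_a + r_b around B.
    ux = 2.0 * v_candidate[0] - v_a[0] - v_b[0]
    uy = 2.0 * v_candidate[1] - v_a[1] - v_b[1]

    # Vector from A to B and the combined radius.
    rx = p_b[0] - p_a[0]
    ry = p_b[1] - p_a[1]
    radius = r_a + r_b

    # Already overlapping: treat every velocity as unsafe.
    if math.hypot(rx, ry) <= radius:
        return True

    speed = math.hypot(ux, uy)
    if speed == 0.0:
        return False  # no relative motion, so no future collision

    # The relative velocity must point toward B ...
    if ux * rx + uy * ry <= 0.0:
        return False

    # ... and pass within the combined radius of B's center.
    cross = abs(rx * uy - ry * ux)
    return cross < radius * speed


# Two robots approaching head-on along the x axis.
head_on = in_rvo((0.0, 0.0), (1.0, 0.0), 0.5, (4.0, 0.0), (-1.0, 0.0), 0.5, (1.0, 0.0))
sidestep = in_rvo((0.0, 0.0), (1.0, 0.0), 0.5, (4.0, 0.0), (-1.0, 0.0), 0.5, (0.0, 1.0))
```

In this head-on example, keeping the current velocity is flagged as inside the RVO while a sideways velocity is not. A reward term could, for instance, penalize actions for which such a test returns True, though the exact formulation SwarmNav uses is not given here.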

Key results

Hierarchical training strategy progressively builds collision avoidance skills
Novel reward function combining RVO and goal-region amplification improves safety
Simulations demonstrate superior success rate and computational efficiency over state-of-the-art methods
Real-world experiments confirm robust navigation across diverse dynamic scenarios

Why it matters

Provides a scalable, real-world deployable navigation solution for large-scale swarm robotics applications operating in unpredictable, obstacle-rich environments.

Abstract

Collision avoidance and navigation in dynamic and dense environments remain highly challenging for swarm robotics. To address this, we propose SwarmNav, a novel goal-region amplification navigation policy that leverages LiDAR-based position data to generate velocity commands guiding robots toward their goals while actively avoiding obstacles. SwarmNav is trained within a deep reinforcement learning actor-critic framework. In this framework, the reward function integrates a goal-region amplification term with the reciprocal velocity obstacles formulation, enabling goal-directed navigation under dynamic obstacle uncertainty. Extensive simulations demonstrate that SwarmNav significantly outperforms state-of-the-art approaches, including both reinforcement learning-based and traditional velocity obstacle-based methods, in terms of success rate and computational efficiency. Real-world experiments across diverse scenarios further confirm its effectiveness in dynamic and dense environments.

Index terms

Swarm Robotics, Collision Avoidance, Reinforcement Learning