Event-Driven MARL for Collaborative Swarm Confrontation in Asynchronous Environments
Qizhen Wu, Lei Chen, Kexin Liu, Jinhu Lv
AI summary
Problem
Event-driven multi-agent reinforcement learning reduces decision jitter but disrupts agent coordination due to misaligned information sharing and inconsistent strategy updates across diverse timescales. This hinders effective collaboration in large-scale, dynamic swarm confrontation scenarios.
Approach
The method introduces an experience selection scheme to filter and synchronize joint information across asynchronous agents, paired with Transformer-based actor-critic networks to process complex historical confrontation data for coordinated decision-making.
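As a rough illustration of what synchronizing joint information across asynchronous agents could look like, the sketch below aligns per-agent task logs on the steps where any agent's task terminates. The selection rule, the `Transition` record, and the `select_joint_experience` helper are hypothetical and are not taken from the paper; they only show the kind of bookkeeping such a scheme needs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Transition:
    """One task-level experience of a single agent (hypothetical record layout)."""
    t_start: int   # step at which the agent chose its task
    t_end: int     # step at which the task's termination event fired
    action: int    # task index chosen at t_start
    reward: float  # reward accumulated while the task was running


def select_joint_experience(logs: Dict[str, List[Transition]]) -> List[dict]:
    """Build one joint sample per step at which at least one agent finished a task.

    Assumed rule (not the paper's): every agent contributes the transition that
    covers the event step, and a 'fresh' flag marks agents whose task ended
    exactly at that step, so stale entries can be masked during training.
    """
    event_steps = sorted({tr.t_end for trs in logs.values() for tr in trs})
    batch = []
    for t in event_steps:
        transitions: Dict[str, Optional[Transition]] = {}
        fresh: Dict[str, bool] = {}
        for name, trs in logs.items():
            # Transition whose interval (t_start, t_end] contains step t, if any.
            active = next((tr for tr in trs if tr.t_start < t <= tr.t_end), None)
            transitions[name] = active
            fresh[name] = active is not None and active.t_end == t
        batch.append({"t": t, "transitions": transitions, "fresh": fresh})
    return batch


# Two agents whose tasks end at different steps (made-up numbers).
logs = {
    "a": [Transition(0, 2, action=1, reward=0.5), Transition(2, 6, action=0, reward=1.0)],
    "b": [Transition(0, 3, action=2, reward=0.2), Transition(3, 6, action=1, reward=0.7)],
}
batch = select_joint_experience(logs)
for sample in batch:
    print(sample["t"], sample["fresh"])
```

In this toy log the joint samples fall on steps 2, 3, and 6, and only at step 6 do both agents contribute a freshly terminated task.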
Key results
- Event-driven termination reduces decision jitter
- Experience selection synchronizes training across timescales
- Transformer QMIX captures temporal correlations for coordination
- Outperforms conventional MARL in large-scale confrontations
Why it matters
Provides a scalable coordination framework for robotic swarms operating in dynamic, asynchronous environments, advancing autonomous multi-robot systems.
Abstract
Multi-agent reinforcement learning (MARL) provides a flexible solution for tackling task and motion planning challenges, particularly in swarm confrontation scenarios. By customizing termination conditions for diverse tasks, event-driven MARL reduces decision jitter caused by frequent task switching. However, it hinders robots from updating strategies on a consistent timescale, leading to misaligned information sharing that disrupts agent coordination. To address this, we propose a novel event-driven MARL approach that facilitates collaborative strategy learning under asynchronous conditions. The approach introduces an experience selection scheme tailored to diverse timescales, ensuring efficient training through synchronized information sharing among robots. By incorporating Transformers, our method enables robots to infer others’ behaviors from historical data, optimizing collaborative strategies. Extensive experiments validate the effectiveness of our proposed approach.
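To make the decision-jitter argument concrete, the following minimal sketch compares a robot that re-decides its task at every step with one that holds its task until a termination condition fires. The task names, the fixed-duration termination rule, and all function names are assumptions chosen for illustration; the paper's termination conditions are task-specific events and are not reproduced here.

```python
from typing import Callable, List


def run_per_step(preferences: List[str]) -> List[str]:
    """Baseline: re-decide at every step, so the task follows each preference flip."""
    return list(preferences)


def run_event_driven(preferences: List[str],
                     terminated: Callable[[str, int], bool]) -> List[str]:
    """Hold the current task until its termination condition fires, then re-decide."""
    tasks: List[str] = []
    current = None
    held = 0  # number of steps the current task has been executed
    for pref in preferences:
        if current is None or terminated(current, held):
            current, held = pref, 0
        tasks.append(current)
        held += 1
    return tasks


def count_switches(tasks: List[str]) -> int:
    """Number of steps at which the executed task differs from the previous step."""
    return sum(a != b for a, b in zip(tasks, tasks[1:]))


# Hypothetical termination rule: each task ends after a fixed number of steps.
DURATION = {"attack": 3, "evade": 2}


def terminated(task: str, held: int) -> bool:
    return held >= DURATION[task]


# Made-up sequence of per-step preferred tasks for one robot.
prefs = ["attack", "evade", "attack", "evade", "evade", "attack", "attack", "evade"]
per_step = run_per_step(prefs)
event = run_event_driven(prefs, terminated)
print(count_switches(per_step), count_switches(event))
```

With this input the per-step policy switches tasks five times and the event-driven policy twice. The same mechanism also shows why timescales diverge: each robot re-decides only when its own task ends, so decision steps no longer line up across the swarm.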