AI summary
Problem
Classic modular approaches for robot shooting games struggle with limited observability, reliance on depth mapping or global localization, and dependencies on inter-robot communication, which limit scalability and real-world deployment.
Approach
The authors train a privileged state-based teacher policy using multi-agent reinforcement learning and distill it into a vision-based student policy that directly maps monocular images and depth heatmaps to velocity commands, enhanced by a permutation-invariant feature extractor.
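To make the idea of a permutation-invariant feature extractor concrete, here is a minimal sketch. It is not the authors' architecture: the shared single-layer encoder, the max pooling, and the names `make_encoder`, `embed`, and `pooled_features` are assumptions chosen for illustration. The sketch only shows why encoding each observed entity with shared weights and then pooling gives a feature that does not depend on the order in which entities are listed.

```python
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Build a hypothetical shared one-layer encoder (random weights, zero biases)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def embed(encoder, entity):
    """Apply the same linear layer plus ReLU to one entity's feature vector."""
    weights, biases = encoder
    return [
        max(0.0, sum(w * x for w, x in zip(row, entity)) + b)
        for row, b in zip(weights, biases)
    ]


def pooled_features(encoder, entities):
    """Embed every entity with shared weights, then max-pool across entities.

    Max pooling ignores ordering, so the result is identical for any
    permutation of a non-empty `entities` list.
    """
    embedded = [embed(encoder, entity) for entity in entities]
    return [max(column) for column in zip(*embedded)]


# Illustrative inputs: three entities with four made-up features each.
encoder = make_encoder(in_dim=4, out_dim=8)
entities = [
    [0.5, -1.0, 0.2, 1.0],
    [1.5, 0.3, -0.7, 0.0],
    [-0.4, 0.9, 0.1, 1.0],
]
features = pooled_features(encoder, entities)
features_reordered = pooled_features(encoder, list(reversed(entities)))
print(features == features_reordered)
```

Mean or sum pooling would serve the same purpose; max pooling is used here only because it makes the order-independence exact in floating point.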
Key results
- Decentralized end-to-end policy eliminates reliance on explicit state estimation, global localization, and inter-robot communication
- Achieves 16.7% higher hit accuracy and 6% improved collision avoidance compared to classic modular methods
- Permutation-invariant feature extractor and depth-heatmap inputs significantly boost robustness over standard baselines
- Successfully deployed on real-world multi-robot systems with limited onboard computational resources
Why it matters
Provides a scalable, hardware-efficient framework for decentralized multi-robot coordination that can be adapted to real-world applications like autonomous drone interception and dynamic combat scenarios.
Abstract
In this paper, we study multi-robot laser tag, a simplified yet practical shooting-game-style task. Classic modular approaches on these tasks face challenges such as limited observability and reliance on depth mapping and inter-robot communication. To overcome these issues, we present an end-to-end visuomotor policy that maps images directly to robot actions. We train a high-performing teacher policy with multi-agent reinforcement learning and distill its knowledge into a vision-based student policy. Technical designs, including a permutation-invariant feature extractor and depth–heatmap input, improve performance over standard architectures. Our policy outperforms classic methods by 16.7% in hitting accuracy and 6% in collision avoidance, and is successfully deployed on real robots. Code will be released publicly.
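The teacher-to-student distillation step can be pictured as a supervised regression problem. The abstract does not state which loss is used, so the sketch below assumes a plain mean-squared error between teacher and student velocity commands; the function name `distillation_loss` and the two-dimensional commands are hypothetical.

```python
def distillation_loss(teacher_actions, student_actions):
    """Mean-squared error between teacher and student action vectors.

    Assumption: actions are equal-length lists of floats (e.g. velocity
    commands) and both batches are non-empty and the same size.
    """
    total = 0.0
    count = 0
    for teacher, student in zip(teacher_actions, student_actions):
        for t_value, s_value in zip(teacher, student):
            total += (t_value - s_value) ** 2
            count += 1
    return total / count


# Made-up batch: the teacher acts on privileged state, the student on images.
teacher_batch = [[1.0, 0.0], [0.5, -0.5]]
student_batch = [[0.0, 0.0], [0.5, -0.5]]
print(distillation_loss(teacher_batch, student_batch))  # prints 0.25
```

In training, a loss of this kind would be minimized over the student's parameters while the teacher stays fixed.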