ICRA 2026

Cooperative-Competitive Team Play of Real-World Craft Robots

Rui Zhao, Xihui Li, Yizheng Zhang, Yuzhen Liu, Zhong Zhang, Yufeng Zhang, Cheng Zhou, Zhengyou Zhang, Lei Han

AI summary

Out of Distribution State Initialization (OODSI) mitigates the sim-to-real gap, allowing cooperative and competitive team strategies trained with multi-agent reinforcement learning in simulation to be deployed on real-world robots.

Multi-agent reinforcement learning, Sim-to-real transfer, Cooperative robotics, Competitive team play, Out of distribution initialization, Real-world robotics

Problem

Multi-agent reinforcement learning is hard to train efficiently and hard to transfer from simulation to physical robots: asynchronous action execution and discrepancies between simulated and real environments cause simulation-trained policies to fail upon deployment.

Approach

The authors develop a complete robotic platform with discrete and continuous simulations, and propose OODSI to inject out-of-distribution states from testing environments into the training start-state distribution, alongside guided RL with action masking to accelerate learning.
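As a rough illustration of the two ideas above, the sketch below mixes out-of-distribution start states into episode resets and restricts a policy's action distribution with a mask. It is not the authors' implementation: the function names, the `ood_prob` mixing probability, the `(x, y)` state tuples, and the way out-of-distribution states are stored are all assumptions made for this example.

```python
import math
import random


def sample_start_state(sim_reset, ood_states, ood_prob, rng):
    """Choose an episode start state (hypothetical helper).

    With probability ood_prob, return a state recorded in the testing
    environment; otherwise fall back to the simulator's default reset.
    """
    if ood_states and rng.random() < ood_prob:
        return rng.choice(ood_states)
    return sim_reset()


def masked_action_probs(logits, mask):
    """Softmax over logits, with masked-out actions given probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(l for l, m in zip(logits, mask) if m)
    weights = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


# Usage with made-up values: two recorded poses and a three-action policy.
rng = random.Random(0)
ood_buffer = [(0.9, 0.1), (0.2, 0.8)]  # hypothetical (x, y) start poses
start = sample_start_state(lambda: (0.0, 0.0), ood_buffer, 0.3, rng)
probs = masked_action_probs([1.0, 2.0, 0.5], [True, False, True])
```

Keeping the default reset in the mix preserves the original training distribution, while `ood_prob` controls how often training episodes begin from states observed outside it. The mask is applied before normalization so that disallowed actions can never be sampled.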

Key results

Complete multi-robot platform with PyBullet and Gazebo simulations
OODSI method to inject out-of-distribution states into training
20% improvement in Sim2Real performance
Real-world deployment of cooperative and competitive team strategies

Why it matters

Enables scalable, data-driven multi-agent coordination for real-world robotics applications without relying on costly real-world data collection or manual control design.

Abstract

Multi-agent deep Reinforcement Learning (RL) has made significant progress in developing intelligent game-playing agents in recent years. However, the efficient training of collective robots using multi-agent RL and the transfer of learned policies to real-world applications remain open research questions. In this work, we first develop a comprehensive robotic system, including simulation, distributed learning framework, and physical robot components. We then propose and evaluate reinforcement learning techniques designed for efficient training of cooperative and competitive policies on this platform. To address the challenges of multi-agent sim-to-real transfer, we introduce Out of Distribution State Initialization (OODSI) to mitigate the impact of the sim-to-real gap. In the experiments, OODSI improves the Sim2Real performance by 20%. We demonstrate the effectiveness of our approach through experiments with a multi-robot car competitive game and a cooperative task in real-world settings.

Index terms

Multi-Robot Systems, Cooperating Robots, Reinforcement Learning