IROS 2024

D-MARL: A Dynamic Communication-Based Action Space Enhancement for Multi Agent Reinforcement Learning Exploration of Large Scale Unknown Environments

Gabriele Calzolari, Vidya Sumathy, Christoforos Kanellakis, George Nikolakopoulos


Abstract

In this article, we propose a novel communication-based action space enhancement for the D-MARL exploration algorithm to improve the efficiency of mapping an unknown environment, represented by an occupancy grid map. In general, communication between autonomous systems is crucial when exploring large and unstructured environments. In such real-world scenarios, data transmission is limited and relies heavily on inter-agent proximity and the attributes of the autonomous platforms. In the proposed approach, each agent’s policy is optimized with the heterogeneous-agent proximal policy optimization algorithm so that the agent autonomously chooses whether to communicate or to explore the environment. To accomplish this, multiple novel reward functions are formulated by integrating inter-agent communication and exploration. The investigated approach aims to increase efficiency and robustness in the mapping process, minimize exploration overlap, and prevent agent collisions. The D-MARL policies trained on different reward functions are compared to understand the effect of different reward terms on the collaborative attitude of the homogeneous agents. Finally, multiple simulation results are provided to demonstrate the efficacy of the proposed scheme.
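As a rough illustration of the idea of letting an agent choose between exploring and communicating, the following minimal sketch adds a single COMMUNICATE action to a grid-movement action set and rewards both newly explored cells and cells gained by merging maps with nearby agents. It is not the authors' implementation: the cell encoding, the `step` and `merge_maps` helpers, the communication range, the one-cell toy sensor, and the reward weights are all assumptions made for this example, and the policy optimization itself is not shown.

```python
from enum import IntEnum
import numpy as np

UNKNOWN, FREE = -1, 0  # assumed occupancy-grid cell labels


class Action(IntEnum):
    UP = 0
    DOWN = 1
    LEFT = 2
    RIGHT = 3
    COMMUNICATE = 4  # extra action added to the movement set


MOVES = {
    Action.UP: (-1, 0),
    Action.DOWN: (1, 0),
    Action.LEFT: (0, -1),
    Action.RIGHT: (0, 1),
}


def merge_maps(own, other):
    """Fill cells unknown in `own` with values known in `other`.

    Returns the merged map and the number of cells gained.
    """
    merged = own.copy()
    mask = (own == UNKNOWN) & (other != UNKNOWN)
    merged[mask] = other[mask]
    return merged, int(mask.sum())


def step(pos, own_map, action, neighbours,
         comm_range=3.0, w_explore=1.0, w_comm=0.5):
    """Apply one action for one agent (hypothetical toy dynamics).

    `neighbours` is a list of (position, map) pairs for the other agents.
    Returns (new_position, new_map, reward).
    """
    new_map = own_map.copy()
    if action == Action.COMMUNICATE:
        gained = 0
        for other_pos, other_map in neighbours:
            # Communication only succeeds within the assumed range.
            if np.linalg.norm(np.subtract(pos, other_pos)) <= comm_range:
                new_map, n = merge_maps(new_map, other_map)
                gained += n
        return pos, new_map, w_comm * gained

    # Movement action: clamp to the grid and observe the entered cell.
    dr, dc = MOVES[action]
    r = min(max(pos[0] + dr, 0), own_map.shape[0] - 1)
    c = min(max(pos[1] + dc, 0), own_map.shape[1] - 1)
    newly_seen = int(new_map[r, c] == UNKNOWN)
    new_map[r, c] = FREE  # toy sensor: no obstacles, one cell per step
    return (r, c), new_map, w_explore * newly_seen
```

In this toy setup, revisiting an already mapped cell earns no reward, which is one simple way to discourage exploration overlap; collision avoidance and the other reward terms mentioned in the abstract are left out.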

Index terms

Multi-Robot Systems, Reinforcement Learning, Cooperating Robots