ICRA 2026

STAGE: Structure-Adaptive Graph-Encoded Multi-Agent Policy Gradient for Moving Target Search in Uncertain Topological Networks

Qihang Peng, Lizhou Zhu, Lekai Chen, Hongliang Guo, Chih-yung Wen

AI summary

STAGE significantly reduces expected capture time in multi-robot search by dynamically adapting to uncertain, changing network topologies using a bi-scale graph attention encoder and entropy-regularized counterfactual policy gradient.

Multi-robot search, uncertain topology, graph attention network, multi-agent reinforcement learning, counterfactual policy gradient, structure-adaptive planning

Problem

Existing multi-robot efficient search algorithms assume a fixed network topology. This assumption fails in real-world scenarios where edges can become blocked or be newly revealed, forcing a choice between discarding prior map knowledge and planning on outdated structural assumptions.

Approach

The proposed STAGE algorithm uses a bi-scale graph attention network to capture both local and long-range topological changes, combined with an entropy-regularized counterfactual policy gradient to train decentralized multi-robot policies that adapt to uncertain environments.
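As a rough illustration of the entropy-regularized counterfactual policy gradient mentioned above, the following Python sketch computes a counterfactual advantage for one agent, in the style of counterfactual multi-agent policy gradients (COMA), and adds an entropy bonus. The function names, the per-agent list of critic values `q_values`, and the coefficient `beta` are assumptions made for illustration; the summary does not state the paper's exact loss.

```python
import math

def counterfactual_advantage(q_values, policy, action):
    """Advantage of `action` for one agent against a counterfactual baseline.

    q_values[a] is a (hypothetical) centralized-critic estimate of the team
    return if this agent took action a while the other agents' actions stay
    fixed. The baseline marginalizes this agent's action under its own policy.
    """
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[action] - baseline

def entropy(policy):
    """Shannon entropy of a discrete policy (natural log)."""
    return -sum(p * math.log(p) for p in policy if p > 0.0)

def actor_objective(q_values, policy, action, beta=0.01):
    """Per-sample objective to maximize: log-prob times advantage plus entropy bonus.

    `beta` is an assumed entropy coefficient, not a value from the paper.
    """
    adv = counterfactual_advantage(q_values, policy, action)
    return math.log(policy[action]) * adv + beta * entropy(policy)
```

Because the baseline is the policy-weighted mean of the critic values, the expected advantage under the agent's own policy is zero, so the baseline changes the variance of the gradient estimate but not its expectation.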

Key results

Introduces STAGE, a novel multi-agent reinforcement learning (MARL) algorithm explicitly designed for multi-robot efficient search (MuRES) under uncertain topologies.
Proposes a distance-augmented long-range GAT that captures global structural changes while mitigating over-smoothing.
Integrates entropy regularization into counterfactual policy gradients to stabilize learning and enhance exploration.
Demonstrates feasibility and superior performance over state-of-the-art baselines in extensive simulations and physical experiments.

Why it matters

Enables reliable and efficient multi-robot search in dynamic real-world environments like disaster response, where infrastructure damage or uncertainty is common.

Abstract

This paper investigates the multi-robot efficient search (MuRES) problem in uncertain topological networks. One unique characteristic of the studied problem is that the topology of the underlying network is uncertain, posing great challenges to canonical MuRES solutions, which presume a fixed network topology. To address the challenge, this paper proposes the STructure-Adaptive Graph-Encoded policy gradient (STAGE) algorithm for moving target search. STAGE comprises two main components: (1) the bi-scale graph attention network (GAT) encoder, which fuses a k-hop local GAT with a distance-augmented long-range GAT to enable the encoder to capture both local and long-range network structural changes; and (2) the entropy-regularized counterfactual policy gradient module, which employs a structure-aware centralized critic to estimate both the team returns and the network structure information, and trains the decentralized actors via counterfactual marginalization with entropy regularization. Extensive simulation results and physical experiments demonstrate the feasibility and superiority of STAGE for solving MuRES in uncertain topological environments.
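To make the two scales of the encoder more concrete, here is a minimal Python sketch of the graph quantities they would plausibly rely on: a k-hop neighborhood for the local branch, and attention weights biased by hop distance for the long-range branch. How STAGE actually injects distance into attention is not stated in the abstract. The linear penalty `lam * distance` on the logits, the function names, and the adjacency-dict format are all assumptions for illustration.

```python
import math
from collections import deque

def hop_distances(adj, source):
    """BFS hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def k_hop_neighbors(adj, source, k):
    """Nodes within k hops of `source` (inclusive), i.e. the local attention scope."""
    return {v for v, d in hop_distances(adj, source).items() if d <= k}

def distance_biased_attention(scores, dist, lam=0.5):
    """Softmax over reachable nodes with logits scores[v] - lam * dist[v].

    `lam` is an assumed penalty weight: larger values focus attention on
    nearby nodes, smaller values let distant nodes contribute more.
    """
    logits = {v: scores[v] - lam * d for v, d in dist.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {v: math.exp(l - m) for v, l in logits.items()}
    z = sum(exps.values())
    return {v: e / z for v, e in exps.items()}

# Path graph 0-1-2-3, then the same graph with edge (1, 2) blocked.
open_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
blocked_graph = {0: [1], 1: [0], 2: [3], 3: [2]}
```

In this toy example, blocking edge (1, 2) removes nodes 2 and 3 from the set reachable from node 0, so both the k-hop neighborhood and the distance-biased weights change when the topology changes. That kind of structural change is what an encoder for uncertain topologies has to track.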

Index terms

Multi-Robot Systems; Path Planning for Multiple Mobile Robots or Agents; Reinforcement Learning