ICRA 2026

Multi-Dimensional Perturbation Strategies for Adversarial Attacks in Multi-Agent Deep Reinforcement Learning

Runwen Chen, Shuo Feng, Tianzhe Qi, Yucheng Shi, Xiaorong Hu, Yang Zhao, Bo Sun, Zhao Jin, Yizhe Luo, Mingliang Xu

AI summary

The MREFDW-GA algorithm enables efficient, model-agnostic black-box attacks on multi-agent reinforcement learning systems by dynamically weighting perturbation dimensions and adapting evaluation metrics across evolutionary stages.

Adversarial attacks, Multi-agent reinforcement learning, Genetic algorithm, Black-box attacks, Robustness evaluation, Gaussian process regression

Problem

Existing adversarial attack methods for multi-agent deep reinforcement learning (MADRL) incur high computational costs and require detailed knowledge of the agents, and they often fail to address multi-agent dynamics such as non-stationarity and collaboration. Efficient, robust, black-box attack strategies are needed to assess MADRL security accurately.

Approach

The authors reformulate adversarial attacks as an optimization problem and propose MREFDW-GA, a genetic algorithm that uses Gaussian Process Regression to compute adaptive dimension weights and applies multiple robustness evaluation functions to guide the search away from local optima.
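The search loop described above can be illustrated with a toy sketch. This is not the authors' implementation: `toy_team_reward` is a hypothetical stand-in for a black-box MADRL rollout, the two-stage `stage_fitness` is an assumed example of switching evaluation functions during evolution, and `dimension_weights` uses a simple covariance heuristic in place of Gaussian Process Regression. All names and constants are assumptions made for illustration.

```python
import random

EPSILON = 1.0  # assumed per-dimension bound on the perturbation


def toy_team_reward(perturbation):
    """Hypothetical black-box query: team reward under a perturbation."""
    sensitivity = [3.0, 1.5, 0.1, 0.0]  # assumed: only some dimensions matter
    return 10.0 - sum(s * abs(p) for s, p in zip(sensitivity, perturbation))


def stage_fitness(perturbation, generation, n_generations):
    """Assumed two-stage evaluation: reward drop first, then also stealth."""
    reward = toy_team_reward(perturbation)
    if generation < n_generations // 2:
        return -reward  # early stage: maximize damage only
    # late stage: additionally penalize large perturbations
    return -reward - 0.1 * sum(abs(p) for p in perturbation)


def dimension_weights(population, scores):
    """Covariance heuristic standing in for GPR-derived dimension weights."""
    n = len(population)
    mean_score = sum(scores) / n
    raw = []
    for d in range(len(population[0])):
        mags = [abs(ind[d]) for ind in population]
        mean_mag = sum(mags) / n
        cov = sum((m - mean_mag) * (s - mean_score)
                  for m, s in zip(mags, scores)) / n
        raw.append(abs(cov) + 1e-9)  # small floor keeps every weight positive
    total = sum(raw)
    return [r / total for r in raw]


def clip(value):
    return max(-EPSILON, min(EPSILON, value))


def attack(dims=4, pop_size=20, n_generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-EPSILON, EPSILON) for _ in range(dims)]
                  for _ in range(pop_size)]
    for gen in range(n_generations):
        scores = [stage_fitness(ind, gen, n_generations) for ind in population]
        weights = dimension_weights(population, scores)
        ranked = [ind for _, ind in sorted(zip(scores, population),
                                           key=lambda pair: pair[0],
                                           reverse=True)]
        elites = ranked[: pop_size // 4]  # elitism: keep the best quarter
        children = []
        while len(elites) + len(children) < pop_size:
            a, b = rng.sample(elites, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # mutate influential dimensions more strongly
            child = [clip(g + rng.gauss(0.0, 0.5 * dims * w))
                     for g, w in zip(child, weights)]
            children.append(child)
        population = elites + children
    final = [stage_fitness(ind, n_generations - 1, n_generations)
             for ind in population]
    best = max(zip(final, population), key=lambda pair: pair[0])[1]
    return best, weights
```

Calling `attack()` returns the best perturbation found and the last set of dimension weights. The attacker only queries the reward function, which is what makes the loop black-box in this sketch.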

Key results

Mitigates genetic algorithm convergence to local optima through multi-stage robustness evaluation
Improves evolutionary efficiency by adaptively weighting perturbation dimensions with a multi-dimensional RBF kernel
Successfully executes black-box attacks that significantly degrade MADRL performance in driving scenarios
Provides a systematic framework for evaluating and enhancing MADRL adversarial robustness
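As a rough illustration of how an RBF kernel with one length scale per dimension can yield dimension weights, the sketch below uses the common automatic-relevance-determination reading, in which a shorter length scale marks a more influential dimension. The summary does not state how MREFDW-GA maps kernel parameters to weights, so the inverse-length-scale rule and the function names here are assumptions.

```python
import math


def ard_rbf_kernel(x, y, length_scales):
    """RBF kernel with a separate length scale for each input dimension."""
    sq_dist = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return math.exp(-0.5 * sq_dist)


def relevance_weights(length_scales):
    """Assumed rule: weight each dimension by its normalized inverse length scale."""
    inverse = [1.0 / l for l in length_scales]
    total = sum(inverse)
    return [v / total for v in inverse]
```

With length scales `[0.5, 2.0]`, a unit step along the first dimension lowers the kernel value far more than the same step along the second, and `relevance_weights` assigns the first dimension the larger share accordingly.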

Why it matters

Reveals critical vulnerabilities in safety-critical multi-agent systems and offers an efficient, resource-light methodology for researchers and engineers to stress-test and harden MADRL deployments.

Abstract

Research indicates that single-agent reinforcement learning is vulnerable to adversarial attacks, which can lead to decision-making errors. Multi-agent deep reinforcement learning (MADRL) systems face analogous adversarial threats. However, existing attack methods require substantial investment in agent design and computational resources, limiting the feasibility of such attacks. To address this issue, we reformulate adversarial attacks as an optimization problem and propose the MREFDW-GA algorithm, which integrates dimension-weighted perturbations with a multi-stage robustness evaluation function. This combination enhances the efficiency of the evolutionary algorithm while dynamically adjusting the search strategy to escape local optima. Experimental results demonstrate that the method can effectively execute black-box attacks by iteratively generating adversarial perturbations, significantly degrading the performance of MADRL systems and opening new research avenues for efficient black-box attacks.

Index terms

Collision Avoidance, Reinforcement Learning, Motion and Path Planning