ICRA 2026

Multi-Dimensional Perturbation Strategies for Adversarial Attacks in Multi-Agent Deep Reinforcement Learning

Runwen Chen, Shuo Feng, Tianzhe Qi, Yucheng Shi, Xiaorong Hu, Yang Zhao, Bo Sun, Zhao Jin, Yizhe Luo, Mingliang Xu

AI summary

The MREFDW-GA algorithm enables efficient, model-agnostic black-box attacks on multi-agent reinforcement learning systems by dynamically weighting perturbation dimensions and adapting evaluation metrics across evolutionary stages.
Adversarial attacks, Multi-agent reinforcement learning, Genetic algorithm, Black-box attacks, Robustness evaluation, Gaussian process regression

Problem

Existing adversarial attack methods for multi-agent deep reinforcement learning (MADRL) incur high computational costs and require detailed knowledge of the agents, and they often fail to account for multi-agent dynamics such as non-stationarity and collaboration. Efficient, robust black-box attack strategies are therefore needed to assess MADRL security accurately.

Approach

The authors reformulate adversarial attacks as an optimization problem and propose MREFDW-GA, a genetic algorithm that uses Gaussian Process Regression to compute adaptive dimension weights and applies multiple robustness evaluation functions to guide the search away from local optima.
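One way to picture the Gaussian Process Regression step is sketched below. It is a minimal illustration under stated assumptions, not the authors' implementation: it assumes an RBF kernel with one length scale per perturbation dimension, fits those length scales by a coarse coordinate-wise grid search on the GP log marginal likelihood, and reads the normalized inverse length scales as dimension weights. The function names, the grid, the noise level, and the toy fitness data are all hypothetical.

```python
import math
import random

def ard_rbf(a, b, scales):
    """RBF kernel with one length scale per input dimension."""
    return math.exp(-0.5 * sum(((x - y) / l) ** 2 for x, y, l in zip(a, b, scales)))

def log_marginal_likelihood(X, y, scales, noise=1e-4):
    """GP log evidence, using a hand-rolled Cholesky factor of K + noise * I."""
    n = len(X)
    K = [[ard_rbf(X[i], X[j], scales) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return -math.inf  # numerically not positive definite
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # Forward substitution z = L^{-1} y, so that y^T K^{-1} y = z . z
    z = []
    for i in range(n):
        z.append((y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return (-0.5 * sum(v * v for v in z)
            - sum(math.log(L[i][i]) for i in range(n))
            - 0.5 * n * math.log(2.0 * math.pi))

def dimension_weights(X, y, grid=(0.1, 0.3, 1.0, 3.0, 10.0, 30.0), sweeps=3):
    """Assumed rule: weight_d is proportional to 1 / length_scale_d."""
    mean = sum(y) / len(y)
    std = math.sqrt(sum((v - mean) ** 2 for v in y) / len(y)) or 1.0
    y = [(v - mean) / std for v in y]  # standardize the observed fitness values
    dim = len(X[0])
    scales = [1.0] * dim
    for _ in range(sweeps):
        for d in range(dim):
            # Keep the other dimensions fixed and pick the best scale for d.
            scales[d] = max(grid, key=lambda l: log_marginal_likelihood(
                X, y, scales[:d] + [l] + scales[d + 1:]))
    inverse = [1.0 / l for l in scales]
    return [v / sum(inverse) for v in inverse]

# Toy data (hypothetical): the fitness signal reacts to dimension 0 only.
rng = random.Random(0)
X = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(40)]
y = [math.sin(3.0 * x[0]) for x in X]
weights = dimension_weights(X, y)
```

On this toy data the weight on dimension 0 should dominate, since the fitted length scales for the two irrelevant dimensions grow long. A genetic algorithm could then spend more of its mutation budget on the dimensions with high weight.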

Key results

  • Mitigates genetic algorithm convergence to local optima through multi-stage robustness evaluation
  • Improves evolutionary efficiency via adaptive dimension weighting based on a multi-dimensional RBF kernel
  • Successfully executes black-box attacks that significantly degrade MADRL performance in driving scenarios
  • Provides a systematic framework for evaluating and enhancing MADRL adversarial robustness
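The first two results can be illustrated together with a small genetic algorithm whose fitness metric changes between evolutionary stages and whose mutation step is scaled per dimension. This is a hedged sketch only: the summary does not specify the evaluation functions, the stage schedule, or the perturbation budget, so `victim_return`, the two metrics (mean return early, best-case return late), `EPS`, and the example weights are all hypothetical stand-ins. The attacker only queries returns, which is what makes the loop black-box.

```python
import random

EPS = 0.1  # hypothetical L-infinity budget on the perturbation

def victim_return(delta, rng):
    """Hypothetical stand-in for one black-box rollout of the MADRL team."""
    sensitivity = [8.0, 1.0, 0.2, 0.0]  # made-up effect of each dimension
    damage = sum(s * abs(d) for s, d in zip(sensitivity, delta))
    return 10.0 - damage + rng.gauss(0.0, 0.05)  # noisy episodic return

def fitness(delta, gen, n_gen, rng, rollouts=3):
    """Lower is better for the attacker; the metric depends on the stage."""
    returns = [victim_return(delta, rng) for _ in range(rollouts)]
    if gen < n_gen // 2:
        return sum(returns) / rollouts  # early stage: mean return
    return max(returns)                 # late stage: victim's best rollout

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(delta, weights, rng, sigma=0.05):
    """Gaussian mutation whose step size grows with the dimension weight."""
    dim = len(delta)
    child = [d + rng.gauss(0.0, sigma * w * dim) for d, w in zip(delta, weights)]
    return [max(-EPS, min(EPS, c)) for c in child]  # stay inside the budget

def attack(weights, pop_size=20, n_gen=30, seed=0):
    rng = random.Random(seed)
    dim = len(weights)
    pop = [[rng.uniform(-EPS, EPS) for _ in range(dim)] for _ in range(pop_size)]
    for gen in range(n_gen):
        ranked = sorted(pop, key=lambda ind: fitness(ind, gen, n_gen, rng))
        elite = ranked[: pop_size // 4]  # elitism: keep the best quarter
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), weights, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    # Report the individual that scores best under the late-stage metric.
    return min(pop, key=lambda ind: fitness(ind, n_gen, n_gen, rng))

# Example weights (hypothetical), e.g. as produced by a GPR-based estimate.
best = attack(weights=[0.7, 0.2, 0.1, 0.0])
```

Switching the metric midway changes the ranking of candidates, which is one simple way a search might be nudged out of a local optimum; a dimension with zero weight is never mutated, so the query budget goes to the dimensions that matter.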

Why it matters

Reveals critical vulnerabilities in safety-critical multi-agent systems and offers an efficient, resource-light methodology for researchers and engineers to stress-test and harden MADRL deployments.

Abstract

Research indicates that single-agent reinforcement learning is vulnerable to adversarial attacks, which can lead to decision-making errors. Multi-agent deep reinforcement learning (MADRL) systems face analogous adversarial threats. However, existing attack methods require substantial investment in agent design and computational resources, limiting the feasibility of such attacks. To address this issue, we reformulate adversarial attacks as an optimization problem and propose the MREFDW-GA algorithm, which integrates dimension-weighted perturbations and a multi-stage robustness evaluation function. This combination enhances the efficiency of evolutionary algorithms while dynamically adjusting search strategies to escape local optima. Experimental results demonstrate that this method can effectively execute black-box attacks by iteratively generating adversarial perturbations, significantly degrading the performance of MADRL systems and opening new research avenues for efficient black-box attacks.

Index terms

Collision Avoidance, Reinforcement Learning, Motion and Path Planning
