ICRA 2026

ARMOR: Attack-Resilient Reinforcement Learning Control for UAVs

Pritam Dash, Ethan Chan, Nathan P. Lawrence, Karthik Pattabiraman

AI summary

ARMOR enables UAVs to safely navigate under physical sensor attacks by learning robust latent state representations through a two-stage training framework that eliminates the need for iterative adversarial training.
UAV control, reinforcement learning, attack resilience, sensor spoofing, latent representation, safe robotics

Problem

Physical sensor attacks such as GPS spoofing corrupt UAV state estimates and lead to unsafe behavior that conventional safe reinforcement learning methods do not mitigate. Adversarial training, the usual alternative, relies on costly iterative retraining and generalizes poorly to attacks it has not seen.
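
To make the failure mode concrete, here is a minimal sketch, not taken from the paper, of a 1-D position hold whose controller sees only a GPS reading. The function `hold_position`, its parameters, and the constant-bias attack model are all assumptions chosen for illustration.

```python
def hold_position(spoof_bias, steps=200, dt=0.1, gain=1.0, target=0.0):
    """Return the true final position (m) of a 1-D vehicle holding `target`.

    Hypothetical toy model: the controller never sees the true position,
    only a GPS reading shifted by `spoof_bias` metres (0.0 means no attack).
    """
    x = 0.0  # true position, starts on target
    for _ in range(steps):
        gps = x + spoof_bias               # spoofed measurement
        velocity = -gain * (gps - target)  # proportional controller acts on the measurement
        x += velocity * dt                 # true position follows the command
    return x


clean_final = hold_position(spoof_bias=0.0)
spoofed_final = hold_position(spoof_bias=5.0)
print(f"no attack: {clean_final:+.2f} m, spoofed: {spoofed_final:+.2f} m")
```

With a constant bias b, the loop settles where the reading equals the target, so the true position ends near target - b. The controller believes it is on station while the vehicle has moved about 5 m away.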

Approach

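ARMOR uses a two-stage training framework. In the first stage, a teacher encoder trained with privileged attack information produces attack-aware latent states, and the RL policy is trained on those latent states. In the second stage, a student encoder is trained via supervised learning to approximate the teacher's latent states from historical onboard sensor data alone, so the controller can be deployed without privileged information.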
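
The teacher-to-student step can be sketched as a small regression problem. This is a toy under stated assumptions, not ARMOR's implementation: the teacher here is a fixed function (`teacher_encode`) rather than a learned encoder, the student is a linear map fitted by ordinary least squares rather than a trained network, and the RL policy is omitted. The sketch also assumes a three-step window, an unattacked IMU displacement signal, and spoofing that begins inside the window. All function names are hypothetical.

```python
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def make_sample(rng):
    """Return (onboard history, latest GPS reading, privileged attack offset)."""
    s0 = rng.uniform(-1.0, 1.0)        # true position at the oldest step
    v1 = rng.uniform(-0.5, 0.5)        # IMU displacement, step 0 -> 1 (assumed unattacked)
    v2 = rng.uniform(-0.5, 0.5)        # IMU displacement, step 1 -> 2
    bias = rng.choice([-2.0, 2.0])     # spoofing offset in metres
    onset = rng.choice([None, 1, 2])   # step at which spoofing starts; None = no attack
    truth = [s0, s0 + v1, s0 + v1 + v2]
    offset = [bias if onset is not None and k >= onset else 0.0 for k in range(3)]
    gps = [p + a for p, a in zip(truth, offset)]
    history = gps + [v1, v2]           # everything the student is allowed to see
    return history, gps[2], offset[2]


def teacher_encode(gps_now, attack_offset):
    """Stand-in teacher: uses the privileged offset to recover the clean state."""
    return gps_now - attack_offset


def fit_student(samples):
    """Least-squares fit of a linear student to the teacher's latent states."""
    n = len(samples[0][0])
    XtX = [[0.0] * n for _ in range(n)]
    Xty = [0.0] * n
    for history, gps_now, offset in samples:
        target = teacher_encode(gps_now, offset)
        for i in range(n):
            Xty[i] += history[i] * target
            for j in range(n):
                XtX[i][j] += history[i] * history[j]
    return solve(XtX, Xty)


def student_encode(weights, history):
    """Student latent: a function of onboard sensor history only."""
    return sum(w * h for w, h in zip(weights, history))


def mse(errors):
    errors = list(errors)
    return sum(e * e for e in errors) / len(errors)


rng = random.Random(0)
train = [make_sample(rng) for _ in range(300)]
test = [make_sample(rng) for _ in range(200)]
weights = fit_student(train)

student_mse = mse(student_encode(weights, h) - teacher_encode(g, a) for h, g, a in test)
raw_mse = mse(g - teacher_encode(g, a) for h, g, a in test)
print(f"student vs teacher latent MSE: {student_mse:.2e}")
print(f"raw GPS vs teacher latent MSE: {raw_mse:.2f}")
```

In this toy the student can match the teacher because the oldest GPS reading in the window is always clean and the IMU displacements bridge it to the present. A real student encoder would have to learn such cross-sensor consistency from data, which is why the approach feeds it sensor history rather than a single observation.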

Key results

  • Outperforms conventional methods at keeping the UAV safe under physical sensor attacks
  • Improves generalization to attacks not seen during training
  • Reduces training cost by eliminating iterative adversarial training
  • Student encoder approximates the teacher's privileged latent states using only historical sensor data

Why it matters

It offers a practical route to securing autonomous UAVs against sensor spoofing, because the deployed controller needs only onboard sensor history and no privileged attack information. This is relevant to UAV applications such as logistics, surveillance, and emergency response.

Abstract

Unmanned Aerial Vehicles (UAVs) depend on onboard sensors for perception, navigation, and control. However, these sensors are susceptible to physical attacks, such as GPS spoofing, that can corrupt state estimates and lead to unsafe behavior. While reinforcement learning (RL) offers adaptive control capabilities, existing safe RL methods are ineffective against such attacks. We present ARMOR (Adaptive Robust Manipulation-Optimized State Representations), an attack-resilient, model-free RL controller that enables robust UAV operation under adversarial sensor manipulation. Instead of relying on raw sensor observations, ARMOR learns a robust latent representation of the UAV’s physical state via a two-stage training framework. In the first stage, a teacher encoder, trained with privileged attack information, generates attack-aware latent states for RL policy training. In the second stage, a student encoder is trained via supervised learning to approximate the teacher’s latent states using only historical sensor data, enabling real-world deployment without privileged information. Our experiments show that ARMOR outperforms conventional methods for ensuring UAV safety. Further, ARMOR improves generalization to unseen attacks and reduces training cost by eliminating the need for iterative adversarial training.

Index terms

Aerial Systems: Applications; Robot Safety; Transfer Learning
