When the Adversary Knows You Better: Adversarial Training for Learning-Based Legged Robots
Qinchao Xu, Satoshi Yagi, Satoshi Yamamori, Jun Morimoto
AI summary
Problem
Existing adversarial training methods for legged robots restrict attackers to the same observations as the controller, limiting vulnerability discovery and often forcing fine-tuned policies into overly cautious behavior.
Approach
We equip the adversarial agent with privileged state data and curiosity-driven exploration to generate stronger attacks, then fine-tune the locomotion controller using a probabilistic termination rule to maintain agility.
Key results
- Curiosity-driven reward eliminates hand-crafted auxiliary incentives
- Privileged information more than doubles adversarial attack success rate
- Stochastic termination prevents overly conservative fine-tuned behavior
- Robustness gains successfully transfer to real-world quadruped deployments
Why it matters
Enables safer deployment of learning-based legged robots in unstructured environments by systematically exposing and hardening against realistic control vulnerabilities.
Abstract
Deep reinforcement learning has emerged as the dominant paradigm for training legged robots to locomote. However, when these robots are deployed in unstructured, dynamically varying real-world environments, the safety of neural-network-based controllers remains insufficiently guaranteed. Prior studies have demonstrated that sequential adversarial attacks, formulated via reinforcement learning, can effectively expose latent vulnerabilities in controllers and thus serve as a valuable complement to Domain Randomization techniques. These methods, however, are inherently constrained by the assumption that the adversary and the locomotion policy share identical state-space inputs. Our approach overcomes this limitation by incorporating privileged information into the adversarial network’s observation input, thereby more than doubling the attack success rate. Furthermore, we mitigate the controller’s tendency toward overly conservative behavior under attacks by introducing stochastic termination criteria. We validate the proposed method in real-world deployments, showing that it not only significantly enhances robustness but also preserves original task performance.
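To make two ideas from the abstract concrete, the Python sketch below shows one way an adversary's input could be extended with privileged state, and one way a stochastic termination check could look. The abstract gives neither the observation layout nor the termination rule, so the function names, the simple concatenation, and the `terminate_prob` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List, Sequence


def adversary_observation(
    policy_obs: Sequence[float],
    privileged_state: Sequence[float],
) -> List[float]:
    """Hypothetical adversary input: the controller's own observation
    followed by privileged simulator state the controller never sees.
    Plain concatenation is an assumption, not the paper's layout."""
    return list(policy_obs) + list(privileged_state)


def should_terminate(
    unsafe: bool,
    terminate_prob: float,
    rng: random.Random,
) -> bool:
    """Hypothetical stochastic termination: an unsafe state ends the
    episode only with probability `terminate_prob`, instead of always.
    Safe states never trigger termination here."""
    if not 0.0 <= terminate_prob <= 1.0:
        raise ValueError("terminate_prob must lie in [0, 1]")
    # rng.random() returns a float in [0.0, 1.0)
    return unsafe and rng.random() < terminate_prob


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatable sampling
    obs = adversary_observation([0.1, -0.2], [0.9])
    print("adversary input size:", len(obs))
    ended = sum(should_terminate(True, 0.5, rng) for _ in range(1000))
    print("episodes ended out of 1000 unsafe states:", ended)
```

Under this reading, setting `terminate_prob` to 1.0 recovers a deterministic termination rule, while smaller values let some episodes continue past an unsafe state. Whether this corresponds to the authors' criteria cannot be determined from the abstract alone.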