ICRA 2026

Latent Activation Editing: Inference-Time Refinement of Learned Policies for Safer Multirobot Navigation

Satyajeet Das, Darren Chiu, Zhehui Huang, Lars Lindemann, Gaurav Sukhatme

AI summary

LAE safely steers pre-trained multi-quadrotor policies at inference time, cutting collisions by nearly 90% without retraining or weight changes.

Latent Activation Editing, Inference-Time Refinement, Multi-Robot Navigation, Reinforcement Learning, Safety, Quadrotor

Problem

Pre-trained reinforcement learning policies for multi-robot navigation remain vulnerable to rare but critical collisions in cluttered environments, and retraining to fix them is costly, risks catastrophic forgetting, and yields diminishing returns.

Approach

The framework monitors a frozen policy's intermediate latent activations with an online classifier, and replaces flagged unsafe activations with risk-amplified surrogates generated by a latent collision world model, steering behavior without modifying weights.
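
The two-stage loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the split of the policy into `encoder` and `action_head`, the threshold of 0.5, and every `toy_*` function are assumptions made for this example. In the paper the classifier and the latent collision world model are learned; here they are hand-written stand-ins.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def edited_policy_step(
    observation: Sequence[float],
    encoder: Callable[[Sequence[float]], Vector],
    action_head: Callable[[Vector], Vector],
    risk_classifier: Callable[[Vector], float],
    world_model: Callable[[Vector], Vector],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> Tuple[Vector, bool]:
    """Run one inference step of a frozen policy with latent activation editing."""
    # Intermediate activation of the frozen policy (weights are never touched).
    latent = encoder(observation)
    # Stage (i): the online classifier scores the activation.
    flagged = risk_classifier(latent) >= threshold
    if flagged:
        # Stage (ii): replace the activation with a risk-amplified surrogate.
        latent = world_model(latent)
    return action_head(latent), flagged


# --- Toy stand-ins, all hypothetical ---------------------------------------

def toy_encoder(obs: Sequence[float]) -> Vector:
    # Latent = [perceived distance to nearest obstacle, desired speed].
    return [obs[0], obs[1]]


def toy_action_head(latent: Vector) -> Vector:
    # Commanded speed shrinks as the perceived obstacle distance shrinks.
    distance, speed = latent
    return [speed * min(1.0, distance)]


def toy_risk_classifier(latent: Vector) -> float:
    # Flags any state whose perceived obstacle distance is below 1.0.
    return 1.0 if latent[0] < 1.0 else 0.0


def toy_world_model(latent: Vector) -> Vector:
    # Surrogate activation that makes the obstacle look half as far away.
    return [0.5 * latent[0], latent[1]]
```

With these stand-ins, an observation far from obstacles passes through unedited, while a close one is flagged and the action head receives a latent that looks more dangerous, so it commands a lower speed than the unedited policy would.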

Key results

Nearly 90% reduction in cumulative collisions compared to the baseline
Substantially increased fraction of collision-free trajectories while preserving goal completion
Demonstrated real-world feasibility on resource-constrained Crazyflie quadrotors
Established as a lightweight, post-deployment refinement paradigm for learned robot policies
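
The two headline metrics above can be computed as follows. This is a sketch under assumptions: this page does not give the paper's exact metric definitions, so the functions below use the plain reading of "reduction in cumulative collisions" and "fraction of collision-free trajectories", and the function names are hypothetical.

```python
from typing import Sequence


def percent_reduction(baseline_collisions: int, edited_collisions: int) -> float:
    """Percent drop in cumulative collisions relative to the unedited baseline."""
    if baseline_collisions <= 0:
        raise ValueError("baseline must have at least one collision")
    return 100.0 * (baseline_collisions - edited_collisions) / baseline_collisions


def collision_free_fraction(collisions_per_trajectory: Sequence[int]) -> float:
    """Share of trajectories that finished with zero collisions."""
    if not collisions_per_trajectory:
        raise ValueError("need at least one trajectory")
    clean = sum(1 for count in collisions_per_trajectory if count == 0)
    return clean / len(collisions_per_trajectory)
```

Under these definitions, a "nearly 90%" reduction means the edited policy accumulates roughly one tenth of the baseline's collisions over the same set of runs.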

Why it matters

It offers a practical, lightweight way to improve the safety of deployed multi-robot systems without costly retraining or changes to policy weights or architecture.

Abstract

Reinforcement learning has enabled significant progress in complex domains such as coordinating and navigating multiple quadrotors. However, even well-trained policies remain vulnerable to collisions in obstacle-rich environments. Addressing these infrequent but critical safety failures through retraining or fine-tuning is costly and risks degrading previously learned skills. Inspired by activation steering in large language models and latent editing in computer vision, we introduce a framework for inference-time Latent Activation Editing (LAE) that refines the behavior of pre-trained policies without modifying their weights or architecture. The framework operates in two stages: (i) an online classifier monitors intermediate activations to detect states associated with undesired behaviors, and (ii) an activation editing module selectively modifies flagged activations to shift the policy towards safer regimes. In this work, we focus on improving safety in multi-quadrotor navigation. We hypothesize that amplifying a policy’s internal perception of risk can induce safer behaviors. We instantiate this idea through a latent collision world model trained to predict future pre-collision activations, thereby prompting earlier and more cautious avoidance responses. Extensive simulations and real-world Crazyflie experiments demonstrate that LAE achieves a statistically significant reduction in collisions (nearly 90% fewer cumulative collisions compared to the unedited baseline) and substantially increases the fraction of collision-free trajectories, while preserving task completion. More broadly, our results establish LAE as a lightweight paradigm, feasible on resource-constrained hardware, for post-deployment refinement of learned robot policies. Our project page with videos and code is available at https://lae-robotics.github.io/.

Index terms

Multi-Robot Systems, Robot Safety, Machine Learning for Robot Control