Reliable Robotic Task Execution in the Face of Anomalies
Bharath Santhanam, Alex Mitrevski, Santosh Thoduka, Sebastian Houben, Teena Hassan
AI summary
Problem
Learned robot policies lack built-in mechanisms to recognize and recover from execution failures caused by open-environment complexities and domain shifts, making them prone to unsafe behavior during deployment.
Approach
The framework monitors real-time visual observations using an anomaly detector trained on data from nominal task executions. When the observations deviate from nominal execution, it triggers a three-level sequential recovery process: pausing the execution, performing a local perturbation of the robot's state, and resetting the robot to a safe state.
Key results
- Increased execution success rates across two robotic tasks under trajectory deviations and adversarial interventions.
- Validated on two robot platforms (Kinova Gen3 and UFactory xArm 6) and two policy types: a policy trained in simulation and transferred to the real robot, and a general-purpose policy model.
- Delivered a modular, policy-agnostic recovery pipeline that decouples monitoring from execution without requiring expert recovery policies.
- Released an open-source implementation of the detection and recovery framework.
Why it matters
Enables safer, more robust deployment of learning-based robot policies in real-world settings by providing a lightweight, modular solution for failure detection and recovery.
Abstract
Learned robot policies have consistently been shown to be versatile, but they typically have no built-in mechanism for handling the complexity of open environments, which makes them prone to execution failures. Deploying policies without the ability to recognise and react to failures may therefore lead to unreliable and unsafe robot behaviour. In this letter, we present a framework that couples a learned policy with a method to detect visual anomalies during policy deployment and to perform recovery behaviours when necessary, thereby aiming to prevent failures. Specifically, we train an anomaly detection model using data collected during nominal executions of a trained policy. This model is then integrated into the online policy execution process, so that deviations from the nominal execution trigger a three-level sequential recovery process that consists of (i) pausing the execution temporarily, (ii) performing a local perturbation of the robot’s state, and (iii) resetting the robot to a safe state by sampling from a learned execution success model. We verify our proposed method in two different scenarios: (i) a door handle reaching task with a Kinova Gen3 arm using a policy trained in simulation and transferred to the real robot, and (ii) an object placing task with a UFactory xArm 6 using a general-purpose policy model. Our results show that integrating policy execution with anomaly detection and recovery increases the execution success rate in environments with various anomalies, such as trajectory deviations and adversarial human interventions.
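The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `RecoveryHooks`, `run_with_recovery`, `anomaly_score`, `step_policy`, and `threshold` are hypothetical, and the sketch assumes that the anomaly detector outputs a scalar score compared against a fixed threshold, and that each recovery level is attempted only if the previous one did not clear the anomaly. The abstract states neither detail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RecoveryHooks:
    """Hypothetical callbacks for the three recovery levels."""
    pause: Callable[[], None]                # level (i): pause execution temporarily
    perturb: Callable[[], None]              # level (ii): local perturbation of the robot's state
    reset_to_safe_state: Callable[[], None]  # level (iii): reset to a sampled safe state


def run_with_recovery(
    step_policy: Callable[[], bool],
    anomaly_score: Callable[[], float],
    threshold: float,
    hooks: RecoveryHooks,
    max_steps: int = 100,
) -> List[str]:
    """Run a policy while monitoring for anomalies.

    step_policy executes one policy action and returns True when the task
    is finished. anomaly_score returns the score of the current observation;
    scores above threshold count as anomalous (an assumption of this sketch).
    Returns a log of the events that occurred, in order.
    """
    levels = (
        ("pause", hooks.pause),
        ("perturb", hooks.perturb),
        ("reset", hooks.reset_to_safe_state),
    )
    log: List[str] = []
    for _ in range(max_steps):
        if anomaly_score() <= threshold:
            # Nominal observation: let the policy act.
            log.append("step")
            if step_policy():
                return log
            continue
        # Anomalous observation: escalate through the recovery levels and
        # stop escalating as soon as the observation looks nominal again.
        for name, action in levels:
            action()
            log.append(name)
            if anomaly_score() <= threshold:
                break
        # If the anomaly persists after all three levels, the next loop
        # iteration starts the escalation again from the first level.
    return log
```

Because the monitor only wraps calls to `step_policy`, the same loop can be placed around any policy that exposes a step function, which reflects the coupling of a learned policy with a separate detection and recovery method described above.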