ICRA 2026

World Model Failure Classification and Anomaly Detection for Autonomous Inspection

Michelle Ho, Muhammad Fadhil Ginting, Isaac Ronald Ward, Andrzej Reinke, Mykel J. Kochenderfer, Ali-akbar Agha-mohammadi, Shayegan Omidshafiei

AI summary

A hybrid world model and conformal prediction framework classifies autonomous inspection outcomes as successes, known failures, or anomalies with over 90% accuracy and earlier detection than humans.

Autonomous inspection, World models, Anomaly detection, Conformal prediction, Failure classification, Robotics

Problem

Autonomous inspection robots struggle to maintain accurate readings under occlusions, limited viewpoints, and unexpected environmental conditions, making it difficult to distinguish between successful tasks, known failures, and novel anomalies.

Approach

The method uses a compressed-video world model to predict future frames and applies conformal prediction thresholds to classify trajectories as successes, known failures, or out-of-distribution cases in a policy-agnostic manner.
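
As an illustration of how conformal prediction thresholds can drive a three-way decision, the following is a minimal sketch of split conformal calibration, not the authors' implementation. The function names, the score definitions, the miscoverage level `alpha`, and the order in which the two thresholds are checked are all assumptions made for this example; the summary only states that two decision functions are set by conformal prediction thresholds.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal threshold from held-out calibration scores.

    Returns the score at 1-indexed rank ceil((n + 1) * (1 - alpha)), so that
    a new exchangeable score exceeds it with probability at most alpha.
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: no finite threshold.
        return float("inf")
    return float(scores[k - 1])

def classify(ood_score, failure_score, ood_thr, failure_thr):
    """Hypothetical three-way rule: check for anomaly first, then known failure."""
    if ood_score > ood_thr:
        return "anomaly"
    if failure_score > failure_thr:
        return "known failure"
    return "success"

# Usage with made-up calibration scores (illustrative values only).
ood_thr = conformal_threshold([0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.4, 1.6, 1.8], alpha=0.2)
print(classify(ood_score=2.5, failure_score=0.1, ood_thr=ood_thr, failure_thr=0.5))
```

Because the threshold is a rank statistic of the calibration scores, the sketch makes no assumption about the score distribution, which is the sense in which such a framework is distribution-free.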

Key results

Over 90% accuracy distinguishing successes, known failures, and OOD cases
Anticipatory detection occurring earlier than human observers
Real-time online deployment on a Boston Dynamics Spot robot
Mahalanobis distance metric achieved 100% OOD detection accuracy
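
For readers unfamiliar with the Mahalanobis distance mentioned in the last result, here is a minimal sketch of how it can be used as an OOD score. It assumes the score is computed on feature vectors against a single Gaussian fit to in-distribution data; the feature space, the regularization constant `eps`, and the function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def fit_gaussian(features, eps=1e-6):
    """Estimate the mean and inverse covariance of in-distribution features.

    `features` has shape (n_samples, n_dims). A small ridge term `eps`
    (an assumption of this sketch) keeps the covariance invertible.
    """
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis(x, mean, cov_inv):
    """Mahalanobis distance of one feature vector from the fitted Gaussian."""
    d = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Usage with synthetic features (illustrative only).
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 3))
mean, cov_inv = fit_gaussian(train)
print(mahalanobis([0.1, -0.2, 0.0], mean, cov_inv))  # near the training data
print(mahalanobis([8.0, 8.0, 8.0], mean, cov_inv))   # far from the training data
```

A sample would then be flagged as OOD when its distance exceeds a calibrated threshold.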

Why it matters

Enables robust, anticipatory failure detection for autonomous inspection robots, improving operational safety and providing a feedback signal for training data assessment.

Abstract

Autonomous inspection robots for monitoring industrial sites can reduce costs and risks associated with human-led inspection. However, accurate readings can be challenging due to occlusions, limited viewpoints, or unexpected environmental conditions. We propose a hybrid framework that combines supervised failure classification with anomaly detection, enabling classification of inspection tasks as a success, known failure, or anomaly (i.e., out-of-distribution) case. Our approach uses a world model backbone with compressed video inputs. This policy-agnostic, distribution-free framework classifies each task using two decision functions whose thresholds are set by conformal prediction (CP), and it does so before a human observer would. We evaluate the framework on gauge inspection feeds collected from office and industrial sites and demonstrate real-time deployment on a Boston Dynamics Spot. Experiments show over 90% accuracy in distinguishing between successes, failures, and OOD cases, with classifications occurring earlier than a human observer. These results highlight the potential for robust, anticipatory failure detection in autonomous inspection tasks or as a feedback signal for model training to assess and improve the quality of training data. Project website: https://autoinspection-classification.github.io/
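
The claim that classifications occur earlier than a human observer can be made concrete with a small sketch of online alarm timing. This is an assumed evaluation procedure, not one described in the abstract: the per-frame score stream, the `patience` debouncing parameter, and the frame rate are all hypothetical.

```python
def first_alarm(scores, threshold, patience=1):
    """Index of the frame that completes `patience` consecutive scores above
    `threshold`, or None if that never happens."""
    run = 0
    for t, s in enumerate(scores):
        run = run + 1 if s > threshold else 0
        if run >= patience:
            return t
    return None

def lead_time(alarm_idx, human_idx, fps):
    """Seconds by which the alarm precedes the human flag (positive = earlier)."""
    return (human_idx - alarm_idx) / fps

# Usage with a made-up score stream and a made-up human annotation frame.
scores = [0.1, 0.2, 0.9, 0.3, 0.8, 0.95, 0.4]
alarm = first_alarm(scores, threshold=0.5, patience=2)
print(alarm, lead_time(alarm, human_idx=35, fps=10.0))
```

Requiring several consecutive exceedances trades a little detection delay for fewer spurious alarms on single noisy frames.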

Index terms

Failure Detection and Recovery, Field Robots