World Model Failure Classification and Anomaly Detection for Autonomous Inspection
Michelle Ho, Muhammad Fadhil Ginting, Isaac Ronald Ward, Andrzej Reinke, Mykel J. Kochenderfer, Ali-akbar Agha-mohammadi, Shayegan Omidshafiei
AI summary
Problem
Autonomous inspection robots struggle to maintain accurate readings under occlusions, limited viewpoints, and unexpected environmental conditions, making it difficult to distinguish between successful tasks, known failures, and novel anomalies.
Approach
The method uses a compressed-video world model to predict future frames and applies conformal prediction thresholds to classify trajectories as successes, known failures, or out-of-distribution cases in a policy-agnostic manner.
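The thresholding step can be sketched as follows. This is a minimal illustration of split conformal calibration under stated assumptions, not the authors' implementation: the function names, the meaning of the two scores (for example, a world-model prediction error and a failure-classifier score), and the order in which the two checks are applied are all hypothetical.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Return the split-conformal quantile of held-out nonconformity scores.

    With n calibration scores, the threshold is the ceil((n + 1) * (1 - alpha))-th
    smallest score. If that rank exceeds n, there are too few calibration points
    for the requested alpha and the threshold is infinite.
    """
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_scores)[rank - 1]


def classify(ood_score, failure_score, ood_threshold, failure_threshold):
    """Three-way decision from two thresholded scores (assumed ordering)."""
    # Assumption: the OOD check runs first, so an unfamiliar trajectory is
    # never labeled as a known failure or a success.
    if ood_score > ood_threshold:
        return "ood"
    if failure_score > failure_threshold:
        return "known_failure"
    return "success"


# Hypothetical calibration scores from in-distribution, successful runs.
ood_threshold = conformal_threshold([0.4, 0.1, 0.3, 0.2], alpha=0.5)
failure_threshold = conformal_threshold([0.2, 0.5, 0.1, 0.3], alpha=0.5)
label = classify(0.25, 0.9, ood_threshold, failure_threshold)
```

Because the thresholds come from empirical quantiles of calibration scores, no distributional assumption on the scores is needed, which is the sense in which such a procedure is distribution-free.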
Key results
- Over 90% accuracy distinguishing successes, known failures, and OOD cases
- Anticipatory detection, with classifications issued earlier than a human observer's
- Real-time online deployment on a Boston Dynamics Spot robot
- Mahalanobis distance metric achieved 100% OOD detection accuracy
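For readers unfamiliar with the Mahalanobis distance named in the last result, the sketch below shows how such an OOD score is commonly computed. It is illustrative only: the summary does not say which features the distance is taken over, so the two-dimensional feature vectors here are made up, and all function names are hypothetical.

```python
import math


def mean_and_covariance(samples):
    """Sample mean and unbiased sample covariance of equal-length vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [
        [
            sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)), assuming cov is invertible."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, diff)  # z = cov^{-1} (x - mean), without forming the inverse
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


# Made-up in-distribution features; a real system would fit these statistics
# on features from nominal trajectories.
train = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = mean_and_covariance(train)
near = mahalanobis((1.0, 1.0), mean, cov)
far = mahalanobis((3.0, 1.0), mean, cov)
```

A sample would then be flagged as OOD when its distance exceeds a calibrated threshold; unlike a plain Euclidean distance, this score accounts for the scale and correlation of the feature dimensions.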
Why it matters
Enables robust, anticipatory failure detection for autonomous inspection robots, improving operational safety and providing a feedback signal for training data assessment.
Abstract
Autonomous inspection robots for monitoring industrial sites can reduce costs and risks associated with human-led inspection. However, accurate readings can be challenging due to occlusions, limited viewpoints, or unexpected environmental conditions. We propose a hybrid framework that combines supervised failure classification with anomaly detection, enabling classification of inspection tasks as a success, known failure, or anomaly (i.e., out-of-distribution) case. Our approach uses a world model backbone with compressed video inputs. This policy-agnostic, distribution-free framework determines classifications based on two decision functions set by conformal prediction (CP) thresholds before a human observer does. We evaluate the framework on gauge inspection feeds collected from office and industrial sites and demonstrate real-time deployment on a Boston Dynamics Spot. Experiments show over 90% accuracy in distinguishing between successes, failures, and OOD cases, with classifications occurring earlier than a human observer. These results highlight the potential for robust, anticipatory failure detection in autonomous inspection tasks or as a feedback signal for model training to assess and improve the quality of training data. Project website: https://autoinspection-classification.github.io/