EMPERROR: A Flexible Generative Perception Error Model for Probing Self-Driving Planners
Niklas Hanselmann, Simon Doll, Marius Cordts, Hendrik Peter Asmus Lensch, Andreas Geiger
AI summary
Problem
Most self-driving planner evaluations assume perfect perception or rely on simple error models that fail to capture complex, scene-consistent failure modes of modern 3D object detectors, leaving planner robustness underexplored.
Approach
The authors propose EMPERROR, a transformer-based generative model that learns to imitate the error distribution of a target 3D object detector given a scene context, enabling the generation of realistic, adversarial noisy detections to stress-test planners.
Key results
- Transformer-based generative perception error model
- Adversarial probing framework for planner stress-testing
- Up to 85% increase in planner collision rates under plausible noise
- Demonstration of learned planning vulnerability to minor detection errors
Why it matters
It provides a tool for evaluating how robust autonomous driving planners are to perception errors, and shows that realistic detection noise can raise an imitation-learning planner's collision rate by up to 85%.
Abstract
To handle the complexities of real-world traffic, learning planners for self-driving from data is a promising direction. While recent approaches have shown great progress, they typically assume a setting in which the ground-truth world state is available as input. However, when deployed, planning needs to be robust to the long tail of errors incurred by a noisy perception system, which is often neglected in evaluation. To address this, previous work has proposed drawing adversarial samples from a perception error model (PEM) mimicking the noise characteristics of a target object detector. However, these methods use simple PEMs that fail to accurately capture all failure modes of detection. In this paper, we present EMPERROR, a novel transformer-based generative PEM, apply it to stress-test an imitation learning (IL)-based planner and show that it imitates modern detectors more faithfully than previous work. Furthermore, it is able to produce realistic noisy inputs that increase the planner’s collision rate by up to 85 %, demonstrating its utility as a valuable tool for a more complete evaluation of self-driving planners.
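To make the idea of probing a planner with samples from a PEM concrete, here is a minimal toy sketch in Python. It is not the authors' method: `toy_pem` is a hand-written drop-or-jitter noise model rather than a learned transformer, `toy_planner` is a hypothetical rule-based stand-in for an IL planner, and `probe` uses plain random search over PEM samples. All names, thresholds, and noise parameters are assumptions chosen for illustration only.

```python
import random

# Assumed geometry: ego drives along +x; objects are (x, y) points in metres.
CORRIDOR_HALFWIDTH = 1.0
HORIZON = 20.0


def in_corridor(obj):
    """True if an object lies in the ego vehicle's driving corridor."""
    x, y = obj
    return 0.0 <= x <= HORIZON and abs(y) <= CORRIDOR_HALFWIDTH


def toy_pem(gt_objects, rng, p_miss=0.1, sigma=0.5):
    """Hypothetical PEM: drop each object with p_miss, else add Gaussian jitter."""
    detections = []
    for x, y in gt_objects:
        if rng.random() < p_miss:
            continue  # simulated false negative
        detections.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
    return detections


def toy_planner(detections):
    """Hypothetical planner: brake if any detection is in the corridor."""
    return "brake" if any(in_corridor(d) for d in detections) else "go"


def collides(action, gt_objects):
    """A collision occurs if the planner proceeds while a true object blocks it."""
    return action == "go" and any(in_corridor(o) for o in gt_objects)


def probe(gt_objects, n_samples=100, seed=0):
    """Random-search probing: return the first PEM sample that causes a collision."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        detections = toy_pem(gt_objects, rng)
        if collides(toy_planner(detections), gt_objects):
            return detections
    return None  # no failure found within the sampling budget
```

With perfect perception the toy planner brakes for an object at `(10.0, 0.9)`, but because that object sits near the corridor edge, a small lateral error or a missed detection is enough to make it proceed. Restricting the search to samples the PEM can actually produce is what keeps the failure cases plausible rather than arbitrary.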