FASIONAD: Adaptive Uncertainty-Gated Fast–Slow Fusion Framework for Safe Autonomous Driving
Ziang Luo, Sicong Jiang, Kangan Qian, Zilin Huang, Tianze Zhu, Siwen Jiao, Jinyu Miao, Zheng Fu, Yang Zhong, Yunlong Wang, Hao YE, Mengmeng Yang, Kun Jiang, Diange Yang
AI summary
Problem
Fixed-interval fast-slow architectures waste compute and add latency during routine driving, while purely end-to-end planners lack reliability in long-tail scenarios.
Approach
FASIONAD uses a Laplace-based uncertainty gate to trigger a slow vision-language model only when the fast planner's confidence drops, feeding back concise planning states and high-level plans via an information bottleneck and cross-attention.
Key results
- Lowers average trajectory error by 6.7% and collision rate by 28.1% versus strong E2E baselines
- Reduces slow-module calls by over 60%, cutting computational overhead
- Achieves state-of-the-art performance across nuScenes, Bench2Drive, and CARLA Town05 benchmarks
- Introduces visual and BEV prompts with reward-guided tuning to align VLM spatial reasoning and curb hallucinations
Why it matters
Offers a practical, efficient architecture for deploying safer autonomous vehicles that dynamically balance real-time control with complex reasoning.
Abstract
Previous fast–slow system architectures demonstrated that pairing a reactive E2E planner with a deliberative vision-language model (VLM) can address long-tail scenarios. However, dual-system models that query the slow module at fixed intervals are computationally inefficient and introduce unnecessary latency during normal operation. To bridge this gap, we introduce FASIONAD, an adaptive fast–slow framework for autonomous driving that selectively integrates E2E planning and VLM reasoning. A lightweight fast planner manages general control, while a slow reasoner is activated only when a Laplace-based uncertainty gate detects a change in uncertainty. Rather than overriding control, the VLM provides concise planning states and high-level plans. These inform the planner through an information bottleneck and high-level action guidance, enhancing interpretability and safety. Evaluated on the nuScenes, Bench2Drive, and CARLA Town05 closed-loop benchmarks, FASIONAD lowers the average trajectory error by 6.7% and the collision rate by 28.1% compared with strong E2E baselines, while also markedly reducing computational overhead relative to always-on fast–slow dual systems. These results demonstrate that adaptive fast–slow fusion is a practical route to safer, more reliable, and more efficient autonomous driving.
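The gating idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the fast planner outputs one Laplace scale parameter b per predicted waypoint, and that the slow reasoner is called when the mean Laplace entropy, 1 + ln(2b), exceeds a threshold. The function names, the mean aggregation, and the threshold value are all hypothetical; the abstract does not specify the gate's exact criterion.

```python
import math
from typing import Sequence


def laplace_entropy(scale: float) -> float:
    """Differential entropy of a Laplace distribution with scale b: 1 + ln(2b)."""
    if scale <= 0:
        raise ValueError("Laplace scale must be positive")
    return 1.0 + math.log(2.0 * scale)


def uncertainty_score(scales: Sequence[float]) -> float:
    """Mean Laplace entropy over waypoints (hypothetical aggregation choice)."""
    if not scales:
        raise ValueError("at least one waypoint scale is required")
    return sum(laplace_entropy(b) for b in scales) / len(scales)


def should_call_slow_reasoner(scales: Sequence[float], threshold: float) -> bool:
    """Assumed gate rule: query the slow VLM only when uncertainty exceeds threshold."""
    return uncertainty_score(scales) > threshold


# Illustrative values only: a confident and an uncertain fast-planner output.
confident_scales = [0.5, 0.5, 0.5]
uncertain_scales = [2.0, 2.0, 2.0]
THRESHOLD = 1.5  # hypothetical; would be tuned on validation data

print(should_call_slow_reasoner(confident_scales, THRESHOLD))
print(should_call_slow_reasoner(uncertain_scales, THRESHOLD))
```

In this sketch the fast planner runs on every frame and the slow module runs only on frames where the gate returns True. That is the source of the compute savings described relative to always-on dual systems.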