D2MFusion: An End-To-End Differentiable Trajectory Optimizer for Safe Reactive Navigation
Xiangyu Zhou, Shenghong Zhang, Xiao Li
AI summary
Problem
Data-driven planners lack safety guarantees and require massive datasets, while traditional model-based planners struggle to adapt to dynamic environments. Existing hybrid approaches often lack deep integration, interpretability, and real-world validation.
Approach
D2MFusion extracts environmental features from BEV images using a CNN, refines them with an LSTM, and dynamically adjusts the parameters of a differentiable Linear Quadratic Regulator (dLQR) to generate trajectories in an end-to-end trainable loop.
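To make the role of the adjustable optimizer parameters concrete, here is a minimal sketch of a finite-horizon LQR whose tracking weight is supplied from outside. It is an illustration under simplifying assumptions, not the authors' dLQR: the system is a scalar single integrator, the CNN and LSTM are omitted, and the weights `q` and `r` (which a perception network would produce) are passed in by hand. All function names are hypothetical.

```python
def lqr_gains(q, r, dt, horizon):
    """Backward Riccati recursion for the scalar system e[t+1] = e[t] + dt*u[t].

    q: tracking weight on the squared error (assumed > 0)
    r: control-effort weight (assumed > 0)
    Returns the feedback gains ordered by time step.
    """
    p = q  # terminal cost-to-go weight
    gains = []
    for _ in range(horizon):
        k = dt * p / (r + dt * dt * p)  # optimal gain for this step
        p = q + p - dt * p * k          # Riccati update with a = 1, b = dt
        gains.append(k)
    gains.reverse()  # gains[t] now applies at time step t
    return gains


def rollout(x0, goal, q, r, dt=0.1, horizon=30):
    """Closed-loop trajectory of a single integrator steered toward `goal`."""
    xs = [x0]
    for k in lqr_gains(q, r, dt, horizon):
        error = xs[-1] - goal
        u = -k * error              # LQR feedback on the tracking error
        xs.append(xs[-1] + dt * u)  # single-integrator dynamics
    return xs


if __name__ == "__main__":
    # Hand-picked weights standing in for the output of a perception network.
    cautious = rollout(0.0, 1.0, q=0.1, r=1.0)
    assertive = rollout(0.0, 1.0, q=10.0, r=1.0)
    print("final position, low tracking weight: ", cautious[-1])
    print("final position, high tracking weight:", assertive[-1])
```

Raising `q` relative to `r` makes the controller close the gap to the goal faster, which is the kind of behavior change a learned feature vector could induce by rescaling cost weights.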
Key results
- Introduces D2MFusion, an end-to-end architecture fusing neural networks with dLQR for safe reactive navigation
- Demonstrates planner explainability by visualizing how learned feature vectors adjust risk and tracking weights
- Achieves high data efficiency and safe reactivity through imitation learning on the nuPlan dataset
- Validates real-world dynamic navigation effectiveness on a Unitree Go2 robot across multiple scenarios
Why it matters
Provides a transparent, data-efficient planning framework that bridges the gap between learning-based adaptability and model-based safety for autonomous vehicles and robots.
Abstract
Data-driven methods provide effective solutions for robot trajectory generation in dynamic environments. However, the real world imposes many physical constraints, and learning them well enough to generate kinematically and dynamically feasible trajectories requires large amounts of data. Because data-driven models are black boxes, it is also difficult to ensure the safety of the trajectories they plan. In this paper, we propose an end-to-end model (D2MFusion) that fuses data-driven components with a model-based optimizer. D2MFusion uses a differentiable optimization layer (dLQR) that forms a backpropagation loop with a perception network. Given an input bird's-eye-view (BEV) image, the perception network outputs an environmental feature vector that adjusts the optimizer parameters, allowing the planner to adapt to the dynamic environment. We train this fusion planner to imitate expert trajectories on a real-world self-driving dataset and demonstrate the planner's explainability, data efficiency, and safe reactivity through closed-loop simulations. We also conduct experiments on a Unitree Go2 robot in three different scenarios to demonstrate that our method can navigate in dynamic environments.
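The backpropagation loop between optimizer and perception network depends on differentiating the optimizer's solution with respect to its parameters. The sketch below shows that idea on a deliberately tiny problem: a one-step quadratic program with a closed-form minimizer, whose tracking weight is fitted to an expert action by gradient descent. This is an assumption-laden toy, not the paper's dLQR or training setup; the one-step cost, the `exp` parameterization, and every name in the code are hypothetical choices made for illustration.

```python
import math


def solve_one_step(q, r, x, goal, b):
    """Closed-form minimizer over u of q*(x + b*u - goal)**2 + r*u**2."""
    return q * b * (goal - x) / (r + q * b * b)


def d_solution_d_q(q, r, x, goal, b):
    """Analytic derivative of the minimizer with respect to the weight q."""
    return b * (goal - x) * r / (r + q * b * b) ** 2


def fit_tracking_weight(u_expert, x, goal, b=0.1, r=1.0, lr=1.0, steps=200):
    """Imitation learning on one sample: tune q so the optimizer's action
    matches the expert action, with gradients passed through the solver."""
    theta = 0.0  # q = exp(theta) keeps the weight strictly positive
    for _ in range(steps):
        q = math.exp(theta)
        u = solve_one_step(q, r, x, goal, b)
        # Chain rule: dL/dtheta = dL/du * du/dq * dq/dtheta, with L = (u - u_expert)**2
        grad = 2.0 * (u - u_expert) * d_solution_d_q(q, r, x, goal, b) * q
        theta -= lr * grad
    return math.exp(theta)


if __name__ == "__main__":
    # Synthetic "expert" action generated with a known weight of 5.0.
    u_expert = solve_one_step(5.0, 1.0, 0.0, 1.0, 0.1)
    print("recovered tracking weight:", fit_tracking_weight(u_expert, x=0.0, goal=1.0))
```

In a full system the scalar `theta` would be replaced by the output of a perception network and the hand-written derivative by automatic differentiation, but the gradient still flows through the optimizer's solution in the same way.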