ICRA 2026

D2MFusion: An End-To-End Differentiable Trajectory Optimizer for Safe Reactive Navigation

Xiangyu Zhou, Shenghong Zhang, Xiao Li

AI summary

Fusing a neural perception network with a differentiable model-based optimizer enables safe, explainable, and data-efficient reactive navigation in dynamic environments.

Differentiable Planning, Safe Navigation, Model-Based Learning, Reactive Trajectory Optimization, End-to-End Learning, Autonomous Robotics

Problem

Data-driven planners lack safety guarantees and require massive datasets, while traditional model-based planners struggle to adapt to dynamic environments. Existing hybrid approaches often lack deep integration, interpretability, and real-world validation.

Approach

D2MFusion extracts environmental features from bird's-eye-view (BEV) images using a CNN, refines them with an LSTM, and uses the resulting feature vector to dynamically adjust the parameters of a differentiable Linear Quadratic Regulator (dLQR), which generates trajectories in an end-to-end trainable loop.
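To make the idea of a feature vector adjusting optimizer parameters concrete, here is a minimal sketch, not the paper's implementation. It assumes a toy 1-D double-integrator robot, a finite-horizon LQR solved by backward Riccati recursion, and a hypothetical two-element feature vector that is mapped to positive cost weights with a softplus. The names `lqr_plan`, `softplus`, `q_track`, and `r_ctrl` are illustrative assumptions; in D2MFusion the features would come from the CNN and LSTM.

```python
import numpy as np

def softplus(z):
    # Assumed mapping from raw features to strictly positive cost weights.
    return np.log1p(np.exp(z))

def lqr_plan(x0, goal, q_track, r_ctrl, steps=30, dt=0.1):
    """Finite-horizon LQR for a toy double integrator (state: position, velocity)."""
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.5 * dt**2], [dt]])
    Q = np.diag([q_track, 0.0])   # penalize position error only
    R = np.array([[r_ctrl]])      # penalize acceleration effort

    # Backward Riccati recursion, starting from terminal cost P_N = Q.
    P = Q.copy()
    gains = []
    for _ in range(steps):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()               # gains[0] now belongs to the first time step

    # Forward rollout on the error state e = x - [goal, 0].
    e = np.asarray(x0, dtype=float) - np.array([goal, 0.0])
    positions = [e[0] + goal]
    for K in gains:
        u = -K @ e
        e = A @ e + B @ u
        positions.append(e[0] + goal)
    return np.array(positions)

# Two hypothetical feature vectors: one asks for tight tracking, one for loose.
q_hi, r_hi = softplus(np.array([2.0, 0.0]))
q_lo, r_lo = softplus(np.array([-2.0, 0.0]))
traj_hi = lqr_plan([0.0, 0.0], goal=1.0, q_track=q_hi, r_ctrl=r_hi)
traj_lo = lqr_plan([0.0, 0.0], goal=1.0, q_track=q_lo, r_ctrl=r_lo)
err_hi = np.sum((traj_hi - 1.0) ** 2)
err_lo = np.sum((traj_lo - 1.0) ** 2)
print(f"tracking error: high weight {err_hi:.3f}, low weight {err_lo:.3f}")
```

With the control weight held fixed, the larger tracking weight yields a trajectory that closes on the goal faster. This is the kind of weight-to-behavior relationship that makes the adjusted parameters inspectable.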

Key results

Introduces D2MFusion, an end-to-end architecture fusing neural networks with dLQR for safe reactive navigation
Demonstrates planner explainability by visualizing how learned feature vectors adjust risk and tracking weights
Achieves high data efficiency and safe reactivity through imitation learning on the nuPlan dataset
Validates real-world dynamic navigation effectiveness on a Unitree Go2 robot across multiple scenarios

Why it matters

Provides a transparent, data-efficient planning framework that bridges the gap between learning-based adaptability and model-based safety for autonomous vehicles and robots.

Abstract

Data-driven methods provide effective solutions for robot trajectory generation in dynamic environments. However, the real world imposes many physical constraints, and learning them well enough to generate kinematically or dynamically feasible trajectories demands large amounts of data. Because data-driven models are black boxes, it is also difficult to ensure the safety of the trajectories they plan. In this paper, we propose D2MFusion, an end-to-end model that fuses data-driven components with a model-based optimizer. D2MFusion uses a differentiable optimization layer (dLQR) that forms a backpropagation loop with a perception network. Given an input BEV image, the perception network outputs an environmental feature vector that adjusts the optimizer parameters to adapt to the dynamic environment. We train this fusion planner to imitate expert trajectories on a real self-driving dataset and demonstrate the planner’s explainability, data efficiency, and safe reactivity through closed-loop simulations. We also conduct experiments on a Unitree Go2 robot in three different scenarios to demonstrate that our method can navigate in dynamic environments.
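The imitation-learning loop through the optimizer can be illustrated with a small sketch under strong simplifying assumptions. Here a single scalar parameter `theta` stands in for the perception network's output, the robot is a toy 1-D single integrator, and central finite differences stand in for true backpropagation through the LQR layer. All names (`plan`, `imitation_loss`, `theta`, `q_expert`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def plan(q_track, x0=0.0, goal=1.0, steps=20, dt=0.1, r_ctrl=1.0):
    """Scalar finite-horizon LQR for x_{k+1} = x_k + dt * u_k (toy model)."""
    p = q_track                      # terminal cost weight
    gains = []
    for _ in range(steps):           # backward Riccati recursion
        k = dt * p / (r_ctrl + dt * dt * p)
        p = q_track + p * (1.0 - dt * k)
        gains.append(k)
    gains.reverse()
    e = x0 - goal                    # error state relative to the goal
    traj = [x0]
    for k in gains:                  # forward rollout with u = -k * e
        e = e + dt * (-k * e)
        traj.append(e + goal)
    return np.array(traj)

q_expert = 5.0                       # assumed weight behind the "expert" demo
expert_traj = plan(q_expert)

def imitation_loss(theta):
    # exp() keeps the learned tracking weight positive.
    return np.mean((plan(np.exp(theta)) - expert_traj) ** 2)

theta = 0.0                          # initial guess: q_track = 1
initial_loss = imitation_loss(theta)
lr, eps = 10.0, 1e-4
for _ in range(200):
    # Central finite difference as a stand-in for autodiff through the solver.
    grad = (imitation_loss(theta + eps) - imitation_loss(theta - eps)) / (2 * eps)
    theta -= lr * grad

q_learned = float(np.exp(theta))
final_loss = imitation_loss(theta)
print(f"learned q_track = {q_learned:.3f}, loss {initial_loss:.4f} -> {final_loss:.2e}")
```

In a real differentiable-optimizer pipeline the gradient would be obtained analytically or by automatic differentiation, and it would flow further back into the perception network's weights rather than into a single scalar.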

Index terms

Integrated Planning and Learning, Autonomous Vehicle Navigation, Motion and Path Planning