ICRA 2026

A Self-Supervised Learning Approach with Differentiable Optimization for UAV Trajectory Planning

Yufei Jiang, Yuanzhu Zhan, Harsh vardhan Gupta, Chinmay Mahendra Borde, Junyi Geng

AI summary

Integrates self-supervised depth perception with differentiable trajectory optimization to cut UAV control effort by 30.90% while guaranteeing dynamic feasibility.

UAV trajectory planning, self-supervised learning, differentiable optimization, 3D cost map, minimum snap, neural time allocation

Problem

Traditional modular UAV path planning suffers from latency and suboptimal performance, while end-to-end learning methods lack dynamical feasibility, require large datasets, and struggle with sim-to-real transfer.

Approach

A self-supervised pipeline that jointly trains a depth perception network and a differentiable minimum-snap trajectory optimizer using a 3D cost map for collision guidance and a neural time-allocation network.
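
To make the link between minimum snap and time allocation concrete, the following is a minimal sketch and not the paper's optimizer. It assumes a single 1D rest-to-rest segment (zero velocity, acceleration, and jerk at both ends), for which the minimum-snap solution is a degree-7 polynomial with a known closed form. All function names are hypothetical, and the integrated squared snap is used here only as an illustrative proxy for control effort.

```python
def septic_blend(tau):
    """Rest-to-rest blend on [0, 1]: 35*tau^4 - 84*tau^5 + 70*tau^6 - 20*tau^7."""
    return tau**4 * (35 - 84 * tau + 70 * tau**2 - 20 * tau**3)


def position(t, p0, p1, T):
    """Position at time t for a segment from p0 to p1 lasting T seconds."""
    tau = min(max(t / T, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return p0 + (p1 - p0) * septic_blend(tau)


def snap(t, p0, p1, T):
    """Fourth time derivative of position (chain rule gives the 1/T^4 factor)."""
    tau = t / T
    s4 = 840 - 10080 * tau + 25200 * tau**2 - 16800 * tau**3
    return (p1 - p0) * s4 / T**4


def snap_cost(p0, p1, T, n=2000):
    """Integral of snap^2 over [0, T] by composite Simpson's rule (n must be even)."""
    h = T / n
    total = snap(0.0, p0, p1, T) ** 2 + snap(T, p0, p1, T) ** 2
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * snap(i * h, p0, p1, T) ** 2
    return total * h / 3


def closed_form_snap_cost(p0, p1, T):
    """Analytic value of the same integral for this rest-to-rest segment."""
    return 100800 * (p1 - p0) ** 2 / T**7


if __name__ == "__main__":
    for T in (1.0, 2.0):
        print(T, snap_cost(0.0, 1.0, T), closed_form_snap_cost(0.0, 1.0, T))
```

Under these assumptions the cost scales as 1/T^7, so doubling the segment duration divides it by 2^7 = 128. That sensitivity is one reason the choice of segment durations matters, which is the role the approach assigns to the time-allocation network.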

Key results

Self-supervised 3D path planning pipeline eliminating need for expert labels
30.90% reduction in control effort with competitive tracking performance
Differentiable minimum-snap optimizer guaranteeing dynamic feasibility
Neural time-allocation network enhancing planning efficiency and optimality

Why it matters

Provides a robust, interpretable, and data-efficient navigation framework for UAVs operating in complex 3D environments under strict size, weight, and power (SWAP) constraints.

Abstract

While Unmanned Aerial Vehicles (UAVs) have gained significant traction across various fields, path planning in 3D environments remains a critical challenge, particularly under size, weight, and power (SWAP) constraints. Traditional modular planning systems often introduce latency and suboptimal performance due to limited information sharing and local minima issues. End-to-end learning approaches streamline the pipeline by mapping sensory observations directly to actions but require large-scale datasets, face significant sim-to-real gaps, or lack dynamical feasibility. In this paper, we propose a self-supervised UAV trajectory planning pipeline that integrates a learning-based depth perception with differentiable trajectory optimization. A 3D cost map guides UAV behavior without expert demonstrations or human labels. Additionally, we incorporate a neural network-based time allocation strategy to improve the efficiency and optimality. The system thus combines robust learning-based perception with reliable physics-based optimization for improved generalizability and interpretability. Both simulation and real-world experiments validate our approach across various environments, demonstrating its effectiveness and robustness. Our method achieves a 30.90% reduction in control effort while maintaining competitive tracking performance compared with state-of-the-art.

Index terms

Aerial Systems: Applications; Motion and Path Planning; Collision Avoidance