ICRA 2026

VDS-Nav: Volumetric Depth-Based Safe Navigation for Aerial Robots – Bridging the Sim-To-Real Gap

Van Huyen Dang, Adrian Redder, Huy Pham, Andriy Sarabakha, Erdal Kayacan

AI summary

VDS-Nav enables reliable sim-to-real transfer for aerial robots by training a navigation policy directly on raw depth sequences with a novel depth-based reward design.

Aerial robots, Vision-based navigation, Reinforcement learning, Sim-to-real transfer, Depth-based reward, End-to-end control

Problem

Vision-based end-to-end navigation for aerial robots struggles with the sim-to-real gap, often relying on latent space encoders that lose information or using reward functions that poorly correlate with raw sensor data.

Approach

The authors propose VDS-Nav, an end-to-end deep reinforcement learning policy that maps sequences of raw depth images directly to velocity and yaw commands, using a novel reward function based on the dot product between velocity and depth pixel vectors to enforce safety constraints.
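To make the dot-product idea concrete, here is a minimal sketch of one way such a term could look. It is not the authors' reward. The pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the `safe_dist` threshold, the treatment of depth as range along each pixel ray, and both function names are assumptions made for illustration.

```python
import numpy as np

def pixel_rays(height, width, fx, fy, cx, cy):
    """Unit direction vector (camera frame, z forward) for every pixel.

    Assumes an ideal pinhole camera; the intrinsics are hypothetical inputs.
    """
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    rays = np.stack(
        [(u - cx) / fx, (v - cy) / fy, np.ones_like(u, dtype=float)], axis=-1
    )
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)

def depth_velocity_penalty(depth, velocity, rays, safe_dist=1.0):
    """Illustrative penalty: negative when the velocity points at nearby pixels.

    depth:    (H, W) range along each pixel ray, in meters (assumption)
    velocity: (3,) commanded linear velocity in the camera frame
    rays:     (H, W, 3) unit pixel directions from pixel_rays
    """
    # Dot product of the velocity with every pixel direction: the speed
    # component along each ray. Pixels the robot moves away from are ignored.
    approach = np.clip(rays @ velocity, 0.0, None)
    # Closeness is 0 beyond safe_dist and grows to 1 as depth approaches 0.
    closeness = np.clip(1.0 - depth / safe_dist, 0.0, None)
    # Penalize the single worst pixel.
    return -float(np.max(approach * closeness))
```

In this sketch, flying toward a wall half a meter away is penalized, while backing away from it or facing distant obstacles yields zero. Because the term is computed from the same depth pixels the policy observes, the reward stays tied to raw sensor data rather than to simulator-only state.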

Key results

Outperforms latent-space baseline in simulation success rate
Achieves real-world deployment with performance closely matching simulation
Enables zero-shot sim-to-real transfer without information loss
Validates improved learning through volumetric depth sequence inputs

Why it matters

Provides a robust, deployable navigation framework for resource-constrained aerial robots operating in cluttered environments, advancing practical vision-based autonomy.

Abstract

End-to-end navigation via deep reinforcement learning has become a key approach for vision-based tasks. However, the sim-to-real gap remains a challenge, especially for aerial robots, where policies trained in simulation often fail in real-world environments. In this work, we propose a novel navigation paradigm – volumetric depth-based safe navigation (VDS-Nav), which trains a policy to infer linear velocities and yaw rate directly from a sequence of depth images, bypassing the need for a pre-trained latent space encoder. We enhance safety with a depth-based reward design, enabling the seamless incorporation of system constraints via logarithmic barrier function methods. Most importantly, using explicit sensor information in our reward design leads to seamless sim-to-real transfer by strengthening the correlation between state-action pairs and received rewards. To evaluate the effectiveness of VDS-Nav, we compare it to a baseline that first trains a variational autoencoder to encode depth images into a latent space for policy training. The simulation results show that VDS-Nav outperforms the baseline in terms of success rate. Furthermore, real-world experiments validate the policy, with real-time performance closely matching simulation results, suggesting an effective sim-to-real transfer.
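The abstract does not give the form of the logarithmic barrier, so the following is only a generic sketch of how a log-barrier term can fold a constraint into a reward. The function name, the `weight` and `floor` parameters, and the convention that a positive `margin` means the constraint is satisfied are all assumptions, not details from the paper.

```python
import math

def log_barrier_reward(margin, weight=0.1, floor=-10.0):
    """Generic log-barrier reward term (illustrative, not the paper's formula).

    margin: constraint slack; > 0 means satisfied, e.g. distance to the
            nearest obstacle minus a safety radius (hypothetical example)
    weight: scale of the barrier term
    floor:  lower clamp so the reward stays finite near or past the boundary
    """
    if margin <= 0.0:
        # Constraint violated: the logarithm is undefined, so return the clamp.
        return floor
    return max(floor, weight * math.log(margin))
```

With these defaults the term is zero at a slack of 1, becomes increasingly negative as the slack shrinks, and saturates at `floor` once the constraint is violated. The clamp is one common way to keep returns bounded during training; other choices are possible.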

Index terms

Vision-Based Navigation; Aerial Systems: Applications; Reinforcement Learning