SIGN: Safety-Aware Image-Goal Navigation for Autonomous Drones Via Reinforcement Learning
Zichen Yan, Rui Huang, Lei He, Shao Guo, Lin Zhao
AI summary
Problem
Existing image-goal navigation methods focus on ground robots or use discrete actions, failing to address the high-frequency control, localization drift, and safety requirements needed for autonomous drone flight in unknown environments.
Approach
SIGN trains an end-to-end continuous velocity policy using self-supervised auxiliary tasks for faster learning, while integrating a decoupled depth-based safety module that predicts collision risk and corrects unsafe actions in real time.
Key results
- State-of-the-art Gibson benchmark performance under continuous control
- Zero-shot sim-to-real transfer to physical drones without fine-tuning
- Real-time obstacle avoidance via depth-based collision prediction and action correction
- Map-less navigation using only RGB and IMU inputs
Why it matters
It provides a practical, low-cost navigation framework for drones operating in GPS-denied or unmapped settings like disaster response and industrial inspection.
Abstract
Image-goal navigation (ImageNav) tasks a robot with autonomously exploring an unknown environment and reaching a location that visually matches a given target image. While prior works primarily study ImageNav for ground robots, enabling this capability for autonomous drones is substantially more challenging due to their need for high-frequency feedback control and global localization for stable flight. In this paper, we propose a novel sim-to-real framework that leverages reinforcement learning (RL) to achieve ImageNav for drones. To enhance visual representation ability, our approach trains the vision backbone with auxiliary tasks, including image perturbations and future transition prediction, which results in more effective policy training. The proposed algorithm enables end-to-end ImageNav with direct velocity control, eliminating the need for external localization. Furthermore, we integrate a depth-based safety module for real-time obstacle avoidance, allowing the drone to safely navigate in cluttered environments. Unlike most existing drone navigation methods that focus solely on reference tracking or obstacle avoidance, our framework supports comprehensive navigation behaviors, including autonomous exploration, obstacle avoidance, and image-goal seeking, without requiring explicit global mapping. Code and model checkpoints are available at https://github.com/Zichen-Yan/SIGN.
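To make the idea of a decoupled depth-based safety module more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple geometric heuristic in place of whatever collision predictor SIGN actually uses: risk is derived from the nearest valid depth reading in a central window of the depth image, and the forward component of the policy's velocity command is scaled down when that risk exceeds a threshold. The function names, the action layout (vx, vy, vz, yaw rate), and all parameter values (safe_dist_m, roi_frac, threshold) are hypothetical and chosen only for illustration.

```python
import numpy as np


def collision_risk(depth_m, safe_dist_m=1.0, roi_frac=0.5):
    """Return a risk score in [0, 1] from a depth image given in metres.

    Assumption: risk grows linearly as the nearest obstacle inside a
    central window gets closer than `safe_dist_m`. This is a stand-in
    heuristic, not the predictor described in the paper.
    """
    h, w = depth_m.shape
    dh = int(h * roi_frac / 2)
    dw = int(w * roi_frac / 2)
    # Central region of interest, roughly along the flight direction.
    roi = depth_m[h // 2 - dh : h // 2 + dh, w // 2 - dw : w // 2 + dw]
    # Discard invalid readings (NaN, inf, zero or negative depth).
    valid = roi[np.isfinite(roi) & (roi > 0)]
    if valid.size == 0:
        return 1.0  # no usable depth: treat as maximal risk
    nearest = float(valid.min())
    return float(np.clip(1.0 - nearest / safe_dist_m, 0.0, 1.0))


def correct_action(velocity, risk, threshold=0.5):
    """Scale down forward speed when risk exceeds `threshold`.

    `velocity` is assumed to be (vx forward, vy lateral, vz vertical,
    yaw_rate). Only positive forward motion is attenuated; the other
    components pass through unchanged.
    """
    v = np.asarray(velocity, dtype=float).copy()
    if risk > threshold and v[0] > 0:
        v[0] *= 1.0 - risk
    return v


# Usage: an open scene leaves the command untouched, while a close
# obstacle in the centre of the image slows the forward velocity.
open_scene = np.full((64, 64), 5.0)
blocked_scene = open_scene.copy()
blocked_scene[30:34, 30:34] = 0.25  # obstacle 0.25 m ahead

policy_cmd = [1.0, 0.0, 0.0, 0.2]
safe_cmd_open = correct_action(policy_cmd, collision_risk(open_scene))
safe_cmd_blocked = correct_action(policy_cmd, collision_risk(blocked_scene))
```

Because the filter only post-processes the policy's output, it can be tuned or replaced without retraining the navigation policy, which is the practical appeal of keeping the safety module decoupled.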