ICRA 2026

ASMA: An Adaptive Safety Margin Algorithm for Vision-Language Drone Navigation Via Scene-Aware Control Barrier Functions

Sourav Sanyal, Kaushik Roy

AI summary

Integrating scene-aware control barrier functions into vision-language navigation increases drone success rates by up to 67% while maintaining safe, real-time obstacle avoidance.
vision-language navigation, control barrier functions, drone safety, model predictive control, adaptive safety margin, robotic autonomy

Problem

Vision-language navigation models lack formal safety guarantees for physical agents operating in dynamic environments with moving obstacles, risking collisions and mission failure.

Approach

The authors propose ASMA, which fuses a CLIP-YOLO vision-language encoder with a scene-aware Control Barrier Function embedded in a Model Predictive Control loop to dynamically adjust drone trajectories and enforce safety constraints in real time.
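To make the barrier-function idea concrete, here is a minimal sketch of a one-step CBF safety filter. It is not the authors' formulation: ASMA embeds its constraint in an MPC loop and derives it from RGB-D observations, whereas this sketch assumes 2D single-integrator dynamics, a single obstacle with known position and velocity, and a barrier h(x) = ||x - x_o||^2 - d^2. The names `adaptive_margin` and `cbf_filter`, and the rule that widens the margin with obstacle speed, are hypothetical and chosen only for illustration.

```python
import math


def adaptive_margin(obstacle_speed, base_margin=0.5, gain=0.4):
    """Hypothetical rule: widen the safety radius as the obstacle moves faster."""
    return base_margin + gain * obstacle_speed


def cbf_filter(pos, u_nom, obs_pos, obs_vel, alpha=1.0):
    """Minimally change u_nom so that dh/dt + alpha * h >= 0 holds.

    Assumes (not from the paper) single-integrator dynamics x_dot = u in 2D
    and h(x) = ||x - x_o||^2 - d^2, with d given by adaptive_margin().
    """
    d = adaptive_margin(math.hypot(obs_vel[0], obs_vel[1]))
    rx, ry = pos[0] - obs_pos[0], pos[1] - obs_pos[1]
    h = rx * rx + ry * ry - d * d

    # dh/dt = 2 r . (u - v_o), so the condition is the half-space a . u >= b.
    ax, ay = 2.0 * rx, 2.0 * ry
    b = ax * obs_vel[0] + ay * obs_vel[1] - alpha * h

    slack = ax * u_nom[0] + ay * u_nom[1] - b
    if slack >= 0.0:
        return u_nom  # nominal command already satisfies the condition

    norm_sq = ax * ax + ay * ay
    if norm_sq == 0.0:
        return (0.0, 0.0)  # drone at obstacle centre: condition infeasible, stop

    # Closest command to u_nom on the constraint boundary (projection).
    scale = -slack / norm_sq
    return (u_nom[0] + scale * ax, u_nom[1] + scale * ay)


if __name__ == "__main__":
    # Drone at the origin flying toward a static obstacle 2 m ahead.
    print(cbf_filter((0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 0.0)))
```

With a single constraint the usual CBF quadratic program reduces to this closed-form projection, so the sketch needs no solver. A full MPC version would impose the constraint at every step of the prediction horizon.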

Key results

  • 64%–67% increase in navigation success rate over the CBF-less VLN baseline
  • Trajectories only 1.4%–5.8% longer than the baseline
  • Real-time dynamic obstacle tracking and on-the-fly safety constraint generation
  • Full-stack ROS deployment validated on a Parrot Bebop2 quadrotor in the Gazebo environment

Why it matters

Provides a deployable safety layer for language-guided drones, advancing reliable autonomous operations in dynamic, human-centric environments.

Abstract

In the rapidly evolving field of vision–language navigation (VLN), ensuring safety for physical agents remains an open challenge. For a human-in-the-loop language-operated drone to navigate safely, it must understand natural language commands, perceive the environment, and simultaneously avoid hazards in real time. Control Barrier Functions (CBFs) are formal methods that enforce safe operating conditions. Model Predictive Control (MPC) is an optimization framework that plans a sequence of future actions over a prediction horizon, ensuring smooth trajectory tracking while obeying constraints. In this work, we consider a VLN-operated drone platform and enhance its safety by formulating a novel scene-aware CBF that leverages ego-centric observations from a camera which has both Red-Green-Blue as well as Depth (RGB-D) channels. A CBF-less baseline system uses a Vision–Language Encoder with cross–modal attention to convert commands into an ordered sequence of landmarks. An object detection model identifies and verifies these landmarks in the captured images to generate a planned path. To further enhance safety, an Adaptive Safety Margin Algorithm (ASMA) is proposed. ASMA tracks moving objects and performs scene-aware CBF evaluation on-the-fly, which serves as an additional constraint within the MPC framework. By continuously identifying potentially risky observations, the system performs prediction in real time about unsafe conditions and proactively adjusts its control actions to maintain safe navigation throughout the trajectory. Deployed on a Parrot Bebop2 quadrotor in the Gazebo environment using the Robot Operating System (ROS), ASMA achieves a 64%–67% increase in success rates with only a slight increase (1.4%–5.8%) in trajectory lengths compared to the baseline CBF-less VLN.
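For readers unfamiliar with how a CBF enters an MPC problem, one common textbook form is written out below. It is a generic discrete-time formulation under stated assumptions, not necessarily the one used in the paper: the stage cost \ell, terminal cost \ell_N, dynamics f, input set \mathcal{U}, horizon N, and decay rate \gamma are all placeholder symbols, and the safe set is taken to be \{x : h(x) \ge 0\}.

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \sum_{k=0}^{N-1} \ell(x_k, u_k) + \ell_N(x_N) \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \\
& u_k \in \mathcal{U}, \\
& h(x_{k+1}) \ge (1-\gamma)\, h(x_k), \quad 0 < \gamma \le 1, \quad k = 0,\dots,N-1
\end{aligned}
```

The last constraint lets h shrink by at most a factor (1-\gamma) per step, so a trajectory that starts with h(x_0) \ge 0 keeps h(x_k) \ge 0 over the horizon. In a scene-aware setting, h would be rebuilt at each control step from the latest obstacle estimates.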

Index terms

Sensor-based Control, Collision Avoidance, AI-Enabled Robotics
