ICRA 2026

Annotation Free Spacecraft Detection and Segmentation Using Vision Language Models

Samet Hicsonmez, Jose Sosa, Dan Pineau, Inder Pal Singh, Arunkumar Rathinam, Abd El Rahman Shabayek, Djamila Aouada

AI summary

A lightweight student model distilled from refined VLM pseudo-labels improves spacecraft detection and segmentation accuracy over direct zero-shot VLM inference, without any manual annotations.

Spacecraft detection, Vision Language Models, Annotation-free learning, Knowledge distillation, Space situational awareness, Zero-shot segmentation

Problem

Manual annotation for spacecraft detection is costly and error-prone due to poor visibility and complex backgrounds, while models trained on synthetic data suffer from domain gaps.

Approach

The pipeline uses a pre-trained Vision Language Model to automatically generate pseudo-labels for a small subset of unlabeled real images, refines them with test-time augmentation and weighted box fusion, and distills them into a compact student model through iterative knowledge distillation.
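The refinement step can be illustrated with a minimal sketch, assuming a single object class, horizontal flipping as the only test-time augmentation, and a simplified form of weighted box fusion. The function names (`unflip_horizontal`, `weighted_box_fusion`), the IoU threshold of 0.55, and the example boxes are illustrative assumptions, not taken from the paper or its code release.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def unflip_horizontal(boxes, image_width):
    """Map boxes predicted on a horizontally flipped image back to the original frame."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    out = boxes.copy()
    out[:, 0] = image_width - boxes[:, 2]
    out[:, 2] = image_width - boxes[:, 0]
    return out

def weighted_box_fusion(views, iou_thr=0.55):
    """Fuse detections from several augmented views (simplified, single class).

    views: list of (boxes, scores) pairs, one per view, already mapped to
    original image coordinates. iou_thr is an assumed value.
    """
    n_views = len(views)
    dets = [(np.asarray(b, dtype=float), float(s))
            for boxes, scores in views for b, s in zip(boxes, scores)]
    dets.sort(key=lambda d: d[1], reverse=True)  # most confident first

    clusters, fused = [], []  # members per cluster, current fused box per cluster
    for box, score in dets:
        best, best_iou = -1, iou_thr
        for i, f in enumerate(fused):
            v = iou(box, f)
            if v > best_iou:
                best, best_iou = i, v
        if best < 0:
            clusters.append([(box, score)])
            fused.append(box.copy())
        else:
            clusters[best].append((box, score))
            w = np.array([s for _, s in clusters[best]])
            b = np.stack([bb for bb, _ in clusters[best]])
            fused[best] = (w[:, None] * b).sum(axis=0) / w.sum()  # confidence-weighted mean

    out_boxes, out_scores = [], []
    for members, f in zip(clusters, fused):
        s = np.array([sc for _, sc in members])
        out_boxes.append(f)
        # Down-weight boxes that only some of the views agreed on.
        out_scores.append(s.mean() * min(len(members), n_views) / n_views)
    return np.array(out_boxes).reshape(-1, 4), np.array(out_scores)

# Hypothetical detections on a 640-pixel-wide image: one from the original
# view, one from the flipped view (given in flipped coordinates).
original = ([[100, 100, 200, 200]], [0.9])
flipped = (unflip_horizontal([[438, 104, 536, 198]], image_width=640), [0.7])
boxes, scores = weighted_box_fusion([original, flipped])
print(boxes, scores)  # one fused box near [101.75, 101.75, 200.875, 199.125], score about 0.8
```

Unlike non-maximum suppression, which keeps only the highest-scoring box of an overlapping group, the fusion averages all agreeing boxes, so a pseudo-label seen consistently across views ends up with both a smoothed location and a higher score than one seen in a single view.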

Key results

Up to 10 points of average precision (AP) gained over direct zero-shot VLM inference on segmentation for SPARK-2024, SPEED+, and TANGO
Eliminates reliance on extensive manual labeling for training
Produces lightweight, real-time capable models for in-orbit deployment
First framework to fully exploit VLM zero-shot capabilities for spacecraft segmentation
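For readers unfamiliar with the metric behind the reported gains, the following is a minimal sketch of average precision at a single IoU threshold. It is a simplification under stated assumptions: it uses boxes rather than segmentation masks, one class, and a fixed threshold of 0.5, whereas the paper's exact evaluation protocol is not described in this summary. The function names and the toy detections are hypothetical.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def average_precision(preds, gts, iou_thr=0.5):
    """Area under the precision-recall curve at one IoU threshold.

    preds: list of (image_id, box, score)
    gts:   dict mapping image_id to a list of ground-truth boxes
    """
    n_gt = sum(len(v) for v in gts.values())
    if n_gt == 0:
        return 0.0
    matched = {k: [False] * len(v) for k, v in gts.items()}
    tp = []
    # Visit predictions from most to least confident.
    for image_id, box, _ in sorted(preds, key=lambda p: p[2], reverse=True):
        ious = [box_iou(box, g) for g in gts.get(image_id, [])]
        j = int(np.argmax(ious)) if ious else -1
        if j >= 0 and ious[j] >= iou_thr and not matched[image_id][j]:
            matched[image_id][j] = True  # each ground truth can be matched once
            tp.append(1.0)
        else:
            tp.append(0.0)
    cum_tp = np.cumsum(np.array(tp, dtype=float))
    recall = cum_tp / n_gt
    precision = cum_tp / (np.arange(len(tp)) + 1)
    # Make precision monotonically non-increasing, then integrate over recall.
    mrec = np.concatenate([[0.0], recall, [1.0]])
    mpre = np.concatenate([[0.0], precision, [0.0]])
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

# Toy data: two images with one target each; three predictions, one spurious.
gts = {"img1": [[10, 10, 50, 50]], "img2": [[30, 30, 70, 70]]}
preds = [
    ("img1", [10, 10, 50, 50], 0.9),      # correct
    ("img1", [200, 200, 240, 240], 0.8),  # false positive
    ("img2", [32, 30, 72, 70], 0.7),      # correct (IoU about 0.9)
]
ap = average_precision(preds, gts)
print(round(ap, 4))  # 0.8333
```

AP is usually reported as a percentage, so a gain of 10 points corresponds to an increase of 0.10 on the 0-to-1 scale returned here.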

Why it matters

Provides a scalable, annotation-free solution for rapid deployment of robust spacecraft tracking and debris monitoring systems in space situational awareness.

Abstract

Vision Language Models (VLMs) have demonstrated remarkable performance in open-world zero-shot visual recognition. However, their potential in space-related applications remains largely unexplored. In the space domain, accurate manual annotation is particularly challenging due to factors such as low visibility, illumination variations, and object blending with planetary backgrounds. Developing methods that can detect and segment spacecraft and orbital targets without requiring extensive manual labeling is therefore of critical importance. In this work, we propose an annotation-free detection and segmentation pipeline for space targets using VLMs. Our approach begins by automatically generating pseudo-labels for a small subset of unlabeled real data with a pre-trained VLM. These pseudo-labels are then leveraged in a teacher-student label distillation framework to train lightweight models. Despite the inherent noise in the pseudo-labels, the distillation process leads to substantial performance gains over direct zero-shot VLM inference. Experimental evaluations on the SPARK-2024, SPEED+, and TANGO datasets on segmentation tasks demonstrate consistent improvements in average precision (AP) by up to 10 points. Code and models are available at https://github.com/giddyyupp/annotation-free-spacecraft-segmentation.

Index terms

Space Robotics and Automation; Aerial Systems: Perception and Autonomy; Deep Learning for Visual Perception