ICRA 2026

Knowledge Optical to Sonar (KnOTS): Towards the Transfer of Knowledge of Underwater Object Detection from Optical to Forward-Looking Sonar Imagery

Caroline Keenan, Ella Wawrzynek, David Whelihan, Ivy Mahncke, John Leonard, Madeline Miller

AI summary

KnOTS automatically transfers object detection knowledge from optical to forward-looking sonar images, eliminating the need for manually labeled sonar data.

forward-looking sonar, object detection, knowledge transfer, autonomous underwater vehicles, YOLO, automated labeling

Problem

Underwater object detection for autonomous vehicles is hindered by the scarcity of labeled sonar data and the environmental limitations of optical cameras. Manual annotation of sonar imagery is labor-intensive and requires specialized expertise, creating a bottleneck for model development.

Approach

The method co-mounts an optical camera and forward-looking sonar on an AUV to capture synchronized imagery. It trains a vision model on optical images, extracts object azimuth boundaries, and maps them to preprocessed sonar data using connected component analysis to automatically generate bounding box labels.
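The azimuth-transfer step described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a pinhole camera with principal point `cx` and focal length `fx` in pixels, a sonar image stored as rows of range bins by columns of beams spanning a symmetric field of view `sonar_fov`, both sensors sharing the same boresight and sign convention, and a simple intensity threshold standing in for the paper's sonar preprocessing. All function and parameter names are hypothetical.

```python
import math
from collections import deque


def pixel_to_azimuth(x_px, cx, fx):
    """Horizontal angle (radians) of an image column under a pinhole model."""
    return math.atan2(x_px - cx, fx)


def azimuth_to_beam(az, sonar_fov, n_beams):
    """Map an azimuth (radians) to the nearest sonar beam (column) index."""
    frac = (az + sonar_fov / 2) / sonar_fov
    return min(n_beams - 1, max(0, round(frac * (n_beams - 1))))


def largest_component_box(sonar, col_lo, col_hi, threshold):
    """Box (x_min, y_min, x_max, y_max) of the largest 4-connected bright
    region whose pixels lie within columns [col_lo, col_hi], or None."""
    n_rows = len(sonar)
    seen = set()
    best = []
    for r in range(n_rows):
        for c in range(col_lo, col_hi + 1):
            if (r, c) in seen or sonar[r][c] < threshold:
                continue
            # Breadth-first flood fill collects one connected component.
            comp = []
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < n_rows and col_lo <= nx <= col_hi
                            and (ny, nx) not in seen
                            and sonar[ny][nx] >= threshold):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    if not best:
        return None
    rows = [p[0] for p in best]
    cols = [p[1] for p in best]
    return (min(cols), min(rows), max(cols), max(rows))


def transfer_box(optical_box, cx, fx, sonar, sonar_fov, threshold):
    """Turn an optical box (x_min, y_min, x_max, y_max) into a sonar box."""
    x_min, _, x_max, _ = optical_box
    n_beams = len(sonar[0])
    lo = azimuth_to_beam(pixel_to_azimuth(x_min, cx, fx), sonar_fov, n_beams)
    hi = azimuth_to_beam(pixel_to_azimuth(x_max, cx, fx), sonar_fov, n_beams)
    return largest_component_box(sonar, min(lo, hi), max(lo, hi), threshold)
```

In this sketch the optical detection only constrains azimuth; the range extent of the sonar box comes entirely from the connected component found inside that azimuth window. Picking the largest component is one plausible selection rule, chosen here for simplicity.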

Key results

0.985 mAP50 on optical images with minimal training data
Automatic generation of sonar bounding boxes without manual labeling
Successful YOLOv11 training on automatically labeled sonar imagery
Real-time processing at 12 image pairs per second on an embedded AUV
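For readers unfamiliar with the mAP50 figure above: it is mean average precision where a predicted box counts as correct only if its intersection over union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that matching criterion follows; the function names are illustrative and not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect on either axis.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, truth_box):
    """True if the prediction would count as a hit under the mAP50 criterion."""
    return iou(pred_box, truth_box) >= 0.5
```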

Why it matters

Provides a scalable, annotation-free pipeline for training robust underwater object detectors, accelerating AUV navigation and search capabilities.

Abstract

We develop an approach to detect objects in forward-looking sonar (FLS) images using corresponding optical images and without the need for expert manual labeling of sonar images. Sonar sensing is more robust to disadvantageous underwater environmental conditions than optical sensing, but the scarcity of labeled sonar data leads to decreased performance of methods which rely on an abundance of training data. We aim to transfer insights from data-rich applications such as object detection in optical imaging to the data-scarce area of object detection in sonar images. Our approach involves recording of contemporaneous images from commercially available sensors viable for use aboard unmanned underwater vehicles. We collect new optical and sonar data in a shallow, clear-water environment and employ existing object detection techniques for optical images. We leverage the commonality of the sensors’ fields of view and our algorithmic processing of the sonar image to transfer knowledge of object bounding boxes to sonar images to create a dataset. Through this transfer, we enable training of a model that detects objects in unseen sonar images and does not require optical images as input at test time.

Index terms

Marine Robotics, Sensor Fusion, Deep Learning for Visual Perception