ICRA 2026

SonarSweep: Fusing Sonar and Vision for Robust 3D Reconstruction Via Plane Sweeping

Lingpeng Chen, Jiakun Tang, Pui Yi Chui, Junfeng Wu, Ziyang Hong

AI summary

SonarSweep enables accurate, dense 3D depth estimation in turbid underwater environments by adaptively fusing sonar and monocular vision through a differentiable plane-sweeping framework.

Underwater 3D reconstruction, sonar-vision fusion, plane sweeping, depth estimation, autonomous underwater vehicles, deep learning

Problem

Single-modality 3D reconstruction fails underwater due to poor visibility for vision and elevation ambiguity for sonar, while existing fusion methods rely on flawed geometric assumptions or heavy computation.
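The elevation ambiguity mentioned above can be illustrated with a minimal sketch. It assumes an idealized imaging sonar that reports only range and azimuth, with a sonar frame of x forward, y lateral, z vertical; the function names `sonar_point` and `sonar_project` are hypothetical and not taken from the paper's code.

```python
import numpy as np

def sonar_point(r, theta, phi):
    """3D point in the (assumed) sonar frame from range r, azimuth theta, elevation phi."""
    return np.array([
        r * np.cos(phi) * np.cos(theta),
        r * np.cos(phi) * np.sin(theta),
        r * np.sin(phi),
    ])

def sonar_project(p):
    """Idealized imaging-sonar measurement: range and azimuth only, elevation is lost."""
    r = np.linalg.norm(p)
    theta = np.arctan2(p[1], p[0])
    return r, theta

# Two distinct 3D points that differ only in elevation angle.
a = sonar_point(3.0, 0.2, -0.1)
b = sonar_point(3.0, 0.2, 0.1)

meas_a = sonar_project(a)
meas_b = sonar_project(b)
```

Both points produce the same (range, azimuth) pair, so a single sonar image cannot tell them apart; a second cue, such as a camera ray, is needed to resolve the elevation.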

Approach

The method adapts the classic plane sweep algorithm into an end-to-end deep learning pipeline that differentiably warps sonar features onto hypothesized 3D planes aligned with the sonar geometry, then fuses them with camera features to regress a dense depth map.
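The warping step can be sketched roughly as follows. This is an illustrative NumPy sketch under several assumptions that do not come from the paper: it sweeps fronto-parallel camera depth planes (the summary says the actual planes are aligned with the sonar geometry, and their parameterization is not given here), it samples raw sonar intensities rather than learned features, and all names (`bilinear_sample`, `sweep_sonar_volume`, `R_sc`, `t_sc`) are hypothetical. Bilinear sampling is the operation that would make such a warp differentiable in a deep learning framework.

```python
import numpy as np

def bilinear_sample(img, rows, cols):
    """Bilinearly sample img (H, W) at fractional rows/cols; out-of-bounds gives 0."""
    H, W = img.shape
    valid = (rows >= 0) & (rows <= H - 1) & (cols >= 0) & (cols <= W - 1)
    r = np.clip(rows, 0, H - 1)
    c = np.clip(cols, 0, W - 1)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, H - 1)
    c1 = np.minimum(c0 + 1, W - 1)
    wr, wc = r - r0, c - c0
    out = ((1 - wr) * (1 - wc) * img[r0, c0] + (1 - wr) * wc * img[r0, c1]
           + wr * (1 - wc) * img[r1, c0] + wr * wc * img[r1, c1])
    return np.where(valid, out, 0.0)

def sweep_sonar_volume(sonar_img, K, R_sc, t_sc, depths, cam_hw, max_range, fov):
    """Warp a sonar image (range bins x azimuth beams) onto camera depth planes.

    Returns a volume of shape (len(depths), H, W): for every camera pixel and
    depth hypothesis, the sonar intensity found at the corresponding range/azimuth.
    """
    H, W = cam_hw
    n_r, n_a = sonar_img.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)
    rays = pix @ np.linalg.inv(K).T           # back-projected rays with z = 1
    volume = np.zeros((len(depths), H, W))
    for i, d in enumerate(depths):
        p_cam = rays * d                      # 3D points on the plane z = d
        p_son = p_cam @ R_sc.T + t_sc         # camera frame -> sonar frame
        rng = np.linalg.norm(p_son, axis=-1)
        azi = np.arctan2(p_son[..., 1], p_son[..., 0])
        rows = rng / max_range * (n_r - 1)    # range -> fractional row index
        cols = (azi + fov / 2) / fov * (n_a - 1)  # azimuth -> fractional column
        volume[i] = bilinear_sample(sonar_img, rows, cols)
    return volume

# Toy setup (all values assumed): one sonar return at 2 m range, 0 rad azimuth.
sonar_img = np.zeros((51, 21))
sonar_img[20, 10] = 1.0                       # row 20 <-> 2 m when max_range = 5 m
K = np.array([[10.0, 0.0, 2.0], [0.0, 10.0, 2.0], [0.0, 0.0, 1.0]])
# Sonar x = camera z (forward), sonar y = camera x, sonar z = camera y.
R_sc = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
t_sc = np.zeros(3)                            # co-located sensors, for simplicity
depths = np.array([1.0, 2.0, 3.0])
volume = sweep_sonar_volume(sonar_img, K, R_sc, t_sc, depths, (5, 5), 5.0, 1.0)
best = depths[np.argmax(volume[:, 2, 2])]     # best depth hypothesis at the center pixel
```

In this toy case the center pixel's response peaks at the depth hypothesis that matches the sonar return. In a learned pipeline, a volume of this kind would be fused with camera features and passed to a network that regresses depth, rather than being resolved by a simple argmax.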

Key results

First adaptation of deep plane sweep to cross-modal sonar-vision fusion
Significantly outperforms state-of-the-art vision, sonar, and heuristic baselines
Validated across high-fidelity simulation and real-world water tank experiments
Public release of the first synchronized stereo-camera and imaging sonar dataset

Why it matters

Provides a robust, real-time perception solution for autonomous underwater vehicles operating in visually degraded environments, advancing underwater robotics and marine exploration.

Abstract

Accurate 3D reconstruction in visually degraded underwater environments remains a formidable challenge. Single-modality approaches are insufficient: vision-based methods fail due to poor visibility and geometric constraints, while sonar is crippled by inherent elevation ambiguity and low resolution. Consequently, prior fusion techniques rely on heuristics and flawed geometric assumptions, leading to significant artifacts and an inability to model complex scenes. In this paper, we introduce SonarSweep, a novel end-to-end deep learning framework that overcomes these limitations by adapting the principled plane sweep algorithm for cross-modal fusion between sonar and visual data. Extensive experiments in both high-fidelity simulation and real-world environments demonstrate that SonarSweep consistently generates dense and accurate depth maps, significantly outperforming state-of-the-art methods under challenging conditions, particularly in high turbidity. To foster further research, we publicly release our code and a novel dataset featuring synchronized stereo-camera and sonar data (the first of its kind) at https://github.com/LIAS-CUHKSZ/SonarSweep.

Index terms

Marine Robotics, Sensor Fusion, Deep Learning for Visual Perception