MV3D: Multi-View 3D Reconstruction of Objects Using Forward-Looking Sonar
Nael Jaber, Bilal Wehbe, Leif Christensen, Frank Kirchner
AI summary
Problem
Forward-looking sonars output 2D images that lack elevation data, making 3D reconstruction difficult due to ambiguous 2D-to-3D correspondences and a scarcity of real-world training datasets.
Approach
An encoder-decoder network extracts features from a batch of 24 sonar images to predict eight multi-view depth maps, which are converted into a complete 3D point cloud, while a Cycle-GAN adapts synthetic training data to match real acoustic styles.
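The conversion from predicted depth maps to a single point cloud can be illustrated with a minimal sketch. It assumes orthographic depth maps, a background value of zero, and one known rotation matrix per view; none of these details are stated in the summary, and the names `depth_map_to_points`, `merge_views`, and `pixel_size` are hypothetical.

```python
import numpy as np

def depth_map_to_points(depth, pixel_size=1.0, background=0.0):
    """Back-project one orthographic depth map into 3D points (view frame).

    Assumption: pixels equal to `background` carry no surface and are skipped.
    """
    depth = np.asarray(depth, dtype=float)
    rows, cols = np.nonzero(depth != background)
    h, w = depth.shape
    # Centre the pixel grid on the optical axis and scale to metric units.
    x = (cols - (w - 1) / 2.0) * pixel_size
    y = (rows - (h - 1) / 2.0) * pixel_size
    z = depth[rows, cols]
    return np.stack([x, y, z], axis=1)

def merge_views(depth_maps, rotations, pixel_size=1.0):
    """Rotate each view's points into a shared object frame and concatenate."""
    clouds = []
    for depth, rotation in zip(depth_maps, rotations):
        points = depth_map_to_points(depth, pixel_size)
        # Row-vector convention: p_object = R @ p_view.
        clouds.append(points @ np.asarray(rotation, dtype=float).T)
    return np.concatenate(clouds, axis=0)
```

With eight views, `rotations` would hold eight matrices describing where each virtual viewpoint sits around the object; how the authors actually parameterize those viewpoints is not given here.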
Key results
- Predicts multi-view depth maps from a linear scan batch
- Achieves accurate 3D reconstruction across basic and complex geometries in simulation
- Validates in real underwater environments with 0.06 m average chamfer distance error
- Bridges simulation-to-real gap via Cycle-GAN style transfer
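The chamfer distance reported above can be sketched as follows. Several variants of the metric exist (squared or unsquared distances, summed or averaged directions); this sketch assumes the symmetric average of unsquared nearest-neighbour distances, which may differ from the exact variant used in the paper. The function name is hypothetical.

```python
import numpy as np

def chamfer_distance(cloud_a, cloud_b):
    """Symmetric chamfer distance between two (N, 3) and (M, 3) point clouds.

    Assumption: mean unsquared nearest-neighbour distance, averaged over
    both directions. Brute force, so only suitable for small clouds.
    """
    a = np.asarray(cloud_a, dtype=float)
    b = np.asarray(cloud_b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    a_to_b = dists.min(axis=1).mean()
    b_to_a = dists.min(axis=0).mean()
    return 0.5 * (a_to_b + b_to_a)
```

For clouds with many thousands of points, a spatial index such as a k-d tree would replace the dense distance matrix.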
Why it matters
Provides a practical, high-accuracy 3D mapping solution for underwater vehicles operating in turbid or dark waters where optical sensors fail.
Abstract
This work proposes a method for learning features from a batch of 2D sonar images to predict a multi-view point cloud for dense 3D reconstruction. Compared with vision-based sensors, acoustic sensors are considered a reliable sensing modality in underwater environments. However, the output of a sonar is a 2D image, which cannot represent the scanned scene in all three dimensions. Estimating this missing information, known as the elevation angle, is the key to performing 3D reconstruction from acoustic images. One approach is to predict a depth map from the 2D sonar image and transform it into a point cloud. In this letter, this idea is extended to learning features from a batch of 2D acoustic images and predicting multiple depth maps of the scanned object, which cover it from different viewpoints. Because datasets from real environments are lacking, the training data for the deep learning model was generated synthetically. To reduce the simulation-to-real gap, a Cycle-GAN was trained on real images to transfer their realistic style to the synthetically generated images. Experiments in simulation showed that the proposed method is able to perform dense 3D reconstruction. The approach was then tested in a real environment using an underwater vehicle, where it accurately reconstructed the scanned objects in 3D, achieving an average chamfer distance error of 0.06 meters when compared to a laser-scanned ground truth.
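The missing elevation angle can be made concrete with a small sketch. It assumes the commonly used forward-looking sonar model in which each pixel records only range and bearing; this model is not spelled out in the abstract, and the function names are hypothetical. Two points that differ only in elevation land on the same pixel.

```python
import numpy as np

def from_spherical(r, theta, phi):
    """3D point in the sonar frame from range r, bearing theta, elevation phi."""
    return np.array([
        r * np.cos(phi) * np.cos(theta),
        r * np.cos(phi) * np.sin(theta),
        r * np.sin(phi),
    ])

def sonar_project(point):
    """Assumed imaging model: keep range and bearing, discard elevation."""
    x, y, z = point
    r = np.sqrt(x * x + y * y + z * z)
    theta = np.arctan2(y, x)
    return r, theta

# Two distinct points at the same range and bearing but opposite elevation.
low = from_spherical(5.0, 0.2, -0.1)
high = from_spherical(5.0, 0.2, 0.1)
same_pixel = np.allclose(sonar_project(low), sonar_project(high))
```

Here `same_pixel` evaluates to true even though `low` and `high` are different 3D points, which is the ambiguity that depth-map prediction is meant to resolve.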