Image-Level Domain Alignment for Real-Time Underwater Crack Detection Using YOLO with an ROV
Pachelle Carelle Negue Kala, Christophe Viel, Lucia Bergantin
AI summary
Problem
Underwater infrastructure inspections currently rely on slow post-mission processing or risky diver methods, while existing deep learning models fail to generalize from aerial training data to challenging underwater conditions due to a significant domain gap.
Approach
The authors fine-tune a lightweight YOLO11n-seg model on a public aerial crack dataset and apply a non-learning CLAHE-based image enhancement pipeline to align underwater images with the training domain, deploying the quantized model on a Jetson Nano for real-time inference on an ROV.
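The contrast-limiting step at the heart of CLAHE can be sketched as follows. This is a minimal pure-Python illustration on a single grayscale tile, not the authors' pipeline. The function name `clip_limited_equalize`, the `clip_limit` default, and the simple redistribution of clipped counts are assumptions made for illustration. Full CLAHE also splits the image into a grid of tiles and interpolates between their mappings, which a library implementation such as OpenCV's `cv2.createCLAHE` handles.

```python
def clip_limited_equalize(tile, clip_limit=4.0, levels=256):
    """Equalize one grayscale tile using a clipped histogram.

    tile: list of rows, each a list of ints in [0, levels).
    clip_limit: cap on each histogram bin, as a multiple of the mean bin height
                (hypothetical default, not taken from the paper).
    """
    pixels = [p for row in tile for p in row]
    n = len(pixels)

    # Histogram of intensity values in this tile.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Clip each bin at the limit and collect the excess counts.
    limit = max(1, int(clip_limit * n / levels))
    excess = 0
    for i in range(levels):
        if hist[i] > limit:
            excess += hist[i] - limit
            hist[i] = limit

    # Redistribute the excess across all bins (simplified: remainder goes to the lowest bins).
    bonus, remainder = divmod(excess, levels)
    for i in range(levels):
        hist[i] += bonus + (1 if i < remainder else 0)

    # Build the intensity mapping from the cumulative distribution.
    lut = []
    total = 0
    for h in hist:
        total += h
        lut.append(round((levels - 1) * total / n))

    return [[lut[p] for p in row] for row in tile]


# A low-contrast 8x8 tile with values confined to 100..119.
tile = [[100 + (r * 8 + c) % 20 for c in range(8)] for r in range(8)]
enhanced = clip_limited_equalize(tile)
```

Clipping the histogram before equalization limits how strongly any single intensity range is amplified, which is why CLAHE tends to raise local contrast in murky images without blowing up noise as much as plain histogram equalization.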
Key results
- CLAHE enhancement boosts underwater detection recall from 0.326 to 0.716 and AP50 from 0.342 to 0.770
- Quantized YOLO11n-seg achieves real-time inference at approximately 5.3 ms per image on a Jetson Nano
- Successful field validation on a submerged concrete embankment in a high-turbidity lake environment
- Public release of a custom underwater crack detection validation dataset for future benchmarking
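To put the reported figures in perspective, the short calculation below derives the relative gains and the throughput implied by the per-image latency. The helper names are hypothetical. The frame rate assumes the 5.3 ms figure covers model inference only, which the summary does not state; end-to-end throughput including enhancement and camera I/O would be lower.

```python
def relative_gain(before, after):
    """Ratio of a metric after enhancement to its value before."""
    return after / before


def frames_per_second(latency_ms):
    """Throughput implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


recall_gain = relative_gain(0.326, 0.716)  # roughly 2.2x
ap50_gain = relative_gain(0.342, 0.770)    # roughly 2.25x
model_fps = frames_per_second(5.3)         # roughly 189 images/s (inference only, assumed)
```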
Why it matters
Enables safer, faster, and more cost-effective underwater infrastructure inspections by providing a deployable, real-time damage detection pipeline that bypasses the need for costly underwater data collection and annotation.
Abstract
Underwater concrete infrastructure plays a crucial role in energy and water systems. However, it requires regular inspections to ensure structural integrity. Remotely Operated Vehicles (ROVs) offer a safer and more cost-effective alternative to diver-based inspections. The data collected during inspections often require extensive post-mission processing, either manually or through computationally intensive algorithms. This limitation makes real-time damage detection during inspections impossible. In this study, we present a real-time image-level domain alignment pipeline suitable for deployment on resource-constrained hardware. It combines image enhancement with crack detection using a YOLO11n-seg model fine-tuned on a publicly available aerial concrete crack dataset. The model was quantized and deployed on a Jetson Nano, which was connected to an ROV for real-time inference. To reduce the domain gap between the raw underwater images captured by the ROV and the aerial training data, a Contrast Limited Adaptive Histogram Equalization (CLAHE)-based strategy was applied. Field tests were conducted on a submerged concrete embankment in a turbid lake environment. A validation dataset was developed to evaluate performance offline and is publicly available.