ICRA 2026

WideDepth: Millimeter-Accurate Benchmark for Fisheye Depth Estimation

Ilia Indyk, Ignat Penshin, Ivan Sosin, Maxim Monastyrny, Aleksei Valenkov, Ilya Makarov

AI summary

WideDepth is presented as the first indoor fisheye depth benchmark with millimeter-level ground truth, and the authors report that fine-tuning pinhole-based stereo models on its training data yields up to a 62% performance boost on fisheye data.

fisheye depth estimation, indoor benchmark, LiDAR point clouds, stereo matching, equirectangular projection, robotics perception

Problem

Fisheye cameras are vital for robotics and AR but lack accurate indoor depth benchmarks: their nonlinear geometry and information loss make such benchmarks hard to build, and existing datasets are synthetic, outdoor, or imprecise.

Approach

The authors project high-resolution LiDAR point clouds into virtual fisheye and pinhole stereo pairs using the Double Sphere camera model, which enables scalable dataset generation. They also introduce a disparity-to-depth conversion for equirectangular projections.
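
To make the projection step concrete, here is a minimal sketch of the forward Double Sphere projection for a single 3D point in the camera frame. It follows the commonly published form of the model and is not taken from the WideDepth pipeline. The function name, the parameter names (fx, fy, cx, cy, xi, alpha), and the intrinsic values in the usage lines are illustrative assumptions.

```python
import math
from typing import Optional, Tuple


def double_sphere_project(
    point: Tuple[float, float, float],
    fx: float, fy: float, cx: float, cy: float,
    xi: float, alpha: float,
) -> Optional[Tuple[float, float]]:
    """Project a 3D point (camera frame) to pixel coordinates (u, v).

    Returns None when the point falls outside the region where the
    Double Sphere projection is defined.
    """
    x, y, z = point
    d1 = math.sqrt(x * x + y * y + z * z)
    if d1 == 0.0:
        return None  # the camera center itself has no projection

    # Validity bound of the model: points too far behind the camera are rejected.
    w1 = alpha / (1.0 - alpha) if alpha <= 0.5 else (1.0 - alpha) / alpha
    w2 = (w1 + xi) / math.sqrt(2.0 * w1 * xi + xi * xi + 1.0)
    if z <= -w2 * d1:
        return None

    # Shift by the offset between the two spheres, then blend with alpha.
    shifted_z = xi * d1 + z
    d2 = math.sqrt(x * x + y * y + shifted_z * shifted_z)
    denom = alpha * d2 + (1.0 - alpha) * shifted_z
    return fx * x / denom + cx, fy * y / denom + cy


# Hypothetical intrinsics, chosen only for illustration.
pixel = double_sphere_project((0.4, -0.1, 1.5), 300.0, 300.0, 320.0, 240.0, -0.2, 0.6)
print(pixel)
```

With xi = 0 and alpha = 0 the denominator reduces to z, so the function falls back to the ordinary pinhole projection, which is a convenient sanity check for any implementation.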

Key results

First indoor millimeter-accurate fisheye depth benchmark with 5K high-resolution stereo pairs across 101 scenes
CUDA-accelerated pipeline generating fisheye stereo pairs and ground truth directly from LiDAR scans
Novel spherical geometry method for converting disparity to depth in vertical equirectangular stereo pairs
Equirectangular projection outperforms cubemap in stereo matching accuracy
Fine-tuning pinhole-based stereo models boosts performance on fisheye data by up to 62%
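
The disparity-to-depth conversion for vertical equirectangular pairs can be illustrated with textbook spherical triangulation. The sketch below is not the paper's formulation. It assumes that image rows map linearly to the polar angle measured from the baseline axis, that the reference view is the lower camera, and that a point appears disparity_px rows further down in the upper view. The function and parameter names are hypothetical.

```python
import math


def radial_depth_from_vertical_disparity(
    row_ref: float, disparity_px: float, image_height: int, baseline_m: float
) -> float:
    """Distance (meters) from the reference (lower) camera to the scene point.

    The two camera centers and the scene point form a triangle whose angle
    at the scene point equals the angular disparity, so the law of sines
    gives the range along the reference ray.
    """
    theta_ref = math.pi * row_ref / image_height    # polar angle in the lower view
    delta = math.pi * disparity_px / image_height   # angular disparity
    if delta <= 0.0:
        return math.inf  # zero disparity corresponds to a point at infinity
    theta_other = theta_ref + delta                 # polar angle in the upper view
    return baseline_m * math.sin(theta_other) / math.sin(delta)


# Example: 1000-row panorama, 1 m baseline, point seen at row 250 with 250 px disparity.
print(radial_depth_from_vertical_disparity(250.0, 250.0, 1000, 1.0))
```

Note that the result is a radial distance along the viewing ray, not a z-depth as in the pinhole relation Z = f B / d, and that the conversion depends on the row as well as on the disparity.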

Why it matters

Provides a critical, high-precision resource for advancing robotics perception, AR, and autonomous systems that rely on ultra-wide field-of-view cameras.

Abstract

Fisheye cameras are increasingly adopted in robotics for near-field manipulation, navigation, and immersive perception, yet indoor depth benchmarks with accurate ground truth are still missing. To address this, we introduce WideDepth — the first indoor dataset for fisheye depth estimation, featuring 101 scenes containing 5K high-resolution stereo pairs labeled with millimeter-level ground truth depth and disparity. Our dataset also includes paired pinhole and fisheye samples across varying fields of view and baselines in both horizontal and vertical stereo setups. We further propose a method to adapt pinhole-trained stereo models to fisheye images and introduce a novel stereo fisheye image generation pipeline based on high-resolution LiDAR scans. Leveraging these methods, we thoroughly evaluate state-of-the-art monocular depth, stereo matching, and depth completion models on our benchmark. Additionally, we provide 18K LiDAR-derived sparse depth training samples, achieving up to a 62% performance boost on fisheye data when fine-tuning pinhole-based stereo models. In summary, the high precision and versatility of our benchmark set a strong foundation for advancing research in fisheye depth estimation and robotics perception. Project page: ilyaind.github.io/WideDepth

Index terms

Data Sets for Robotic Vision, RGB-D Perception, Omnidirectional Vision