OASIS-DC: Generalizable Depth Completion Via Output-Level Alignment of Sparse-Integrated Monocular Pseudo Depth
Jaehyeon Cho, Jhonghyun An
AI summary
Problem
Monocular foundation models output relative depth, while traditional completion methods require large labeled datasets and extensive validation curation, hindering deployment in dynamic environments.
Approach
The method aligns a frozen monocular depth estimator’s output with sparse LiDAR measurements using a non-learned Poisson formulation to create a calibrated pseudo-depth prior, which a lightweight network then corrects via localized residual refinement.
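As a rough illustration of what a non-learned, Poisson-style alignment could look like, the sketch below solves a small linear least-squares problem: the fused depth should reproduce the gradients of the monocular output while matching the sparse metric anchors. This is not the authors' formulation. The function name `poisson_align`, the anchor weight `lam`, the 4-neighbour gradient stencil, and the dense solver are all assumptions chosen for readability on a tiny grid.

```python
import numpy as np

def poisson_align(rel, sparse, mask, lam=10.0):
    """Hypothetical helper: fuse gradients of `rel` with sparse metric anchors.

    Minimises  sum_(p,q) (D[q] - D[p] - (rel[q] - rel[p]))^2
             + lam * sum_(a in anchors) (D[a] - sparse[a])^2
    as one dense least-squares system (only practical for small images).
    """
    h, w = rel.shape
    idx = np.arange(h * w).reshape(h, w)
    flat_rel = rel.ravel()
    rows, rhs = [], []

    def add_gradient_rows(p, q):
        # Each row enforces D[q] - D[p] = rel[q] - rel[p] for one neighbour pair.
        for a, b in zip(p.ravel(), q.ravel()):
            r = np.zeros(h * w)
            r[b], r[a] = 1.0, -1.0
            rows.append(r)
            rhs.append(flat_rel[b] - flat_rel[a])

    add_gradient_rows(idx[:, :-1], idx[:, 1:])  # horizontal neighbours
    add_gradient_rows(idx[:-1, :], idx[1:, :])  # vertical neighbours

    # Anchor rows pull D towards the metric measurement, weighted by sqrt(lam).
    wgt = np.sqrt(lam)
    for a in idx[mask]:
        r = np.zeros(h * w)
        r[a] = wgt
        rows.append(r)
        rhs.append(wgt * sparse.ravel()[a])

    sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return sol.reshape(h, w)

# Toy example: the "relative" map has the right gradients but a wrong offset.
yy, xx = np.mgrid[0:6, 0:8]
metric = 5.0 + 0.5 * xx + 0.2 * yy          # synthetic ground truth
rel = metric - 3.0                          # shifted, non-metric prediction
mask = np.zeros(metric.shape, dtype=bool)
mask[1, 2] = mask[4, 6] = True              # two synthetic LiDAR returns
sparse = np.where(mask, metric, 0.0)
fused = poisson_align(rel, sparse, mask)
```

In this toy case the gradients and the anchors are mutually consistent, so the solve recovers the synthetic metric map. With real data the two terms disagree and `lam` trades smooth adherence to the monocular structure against fidelity to the anchors; a practical version would likely use a sparse solver rather than a dense one.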
Key results
- Non-learned Poisson fusion aligns frozen monocular depth outputs with sparse LiDAR anchors
- Lightweight residual network corrects local errors while preserving global metric scale
- Achieves top-tier few-shot accuracy on KITTI-DC and NYUv2 benchmarks
- Sustains stable depth and sharp edges under strict, deployment-oriented data scarcity
Why it matters
It offers a computationally efficient, deployment-ready solution for robotics and autonomous driving systems operating under real-world label scarcity.
Abstract
Recent monocular foundation models excel at zero-shot depth estimation, yet their outputs are inherently relative rather than metric, limiting direct use in robotics and autonomous driving. We leverage the fact that relative depth preserves global layout and boundaries: by calibrating it with sparse range measurements, we transform it into a pseudo-metric depth prior. Building on this prior, we design a refinement network that follows the prior where reliable and deviates where necessary, enabling accurate metric predictions from very few labeled samples. The resulting system is particularly effective when curated validation data are unavailable, sustaining stable scale and sharp edges across few-shot regimes. These findings suggest that coupling foundation priors with sparse anchors is a practical route to robust, deployment-ready depth completion under real-world label scarcity.
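To make the idea of calibrating relative depth with sparse range measurements concrete, the sketch below shows the simplest common baseline: fitting one global scale and shift by least squares at the pixels where a range measurement exists. This is a generic technique, not the method described in the abstract, and it assumes the relative map differs from metric depth by an affine transform. The name `fit_scale_shift` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_scale_shift(rel, sparse, mask):
    """Hypothetical helper: least-squares fit of s, t so that s * rel + t
    matches the sparse metric depth at the anchor pixels selected by `mask`."""
    x = rel[mask]
    y = sparse[mask]
    A = np.stack([x, np.ones_like(x)], axis=1)   # columns: [rel, 1]
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t

# Synthetic example: metric depth is an exact affine function of `rel`.
rng = np.random.default_rng(0)
rel = rng.uniform(0.0, 1.0, size=(6, 8))
metric = 4.0 * rel + 2.0
mask = np.zeros(rel.shape, dtype=bool)
mask[::2, ::3] = True                            # nine synthetic anchors
sparse = np.where(mask, metric, 0.0)

s, t = fit_scale_shift(rel, sparse, mask)
pseudo_metric = s * rel + t                      # calibrated dense prior
```

A single global scale and shift cannot correct spatially varying errors, which is one reason a spatially aware alignment followed by a learned residual correction, as the abstract describes, may be preferable.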