← Back ICRA 2026

Depth Completion by Rescaling Monocular Depth Estimates Via Compressed Sensing

Daoxin Zhong, Jun Li, Yeshas Thadimari, Meng Yee (Michael) Chuah

PDF

AI summary

Key figure (auto-extracted from paper)

A training-free compressed sensing pipeline rescales monocular depth estimates using sparse measurements, drastically improving accuracy while cutting GPU VRAM demands for robotics.

Depth completion compressed sensing monocular depth estimation rescaling robotics GPU efficiency

Problem

Existing learning-based depth completion models require additional parameters, domain-specific fine-tuning, and high GPU VRAM, making them impractical for resource-constrained robotic platforms.

Approach

The method computes a per-pixel scaling ratio matrix from sparse depth measurements using compressed sensing and a 2D DCT basis, then applies it to rescale any pre-trained monocular depth estimate without retraining.

Key results

Reduces RMSE and MAE by over 15x compared to raw monocular estimates
Outperforms state-of-the-art depth completion models at sampling ratios above 50%
Eliminates the need for model fine-tuning or additional training data
Lowers runtime GPU VRAM requirements compared to end-to-end neural networks

Why it matters

Enables accurate, real-time depth completion on power- and memory-limited robotic systems without the overhead of training complex neural networks.

Abstract

Depth completion is the challenge of recovering a dense depth map from an RGB image and corresponding sparse depth measurements. Many modern depth completion strategies often rely on deep neural networks, using a monocular depth estimation (MDE) backbone to generate an initial dense depth map from the RGB image. This estimate is then further refined with the help of auxiliary network components that utilise the sparse depth measurements to improve accuracy and restore fine-grained depth details. However, such approaches intro- duce additional model parameters and require domain-specific fine-tuning, making them impractical for resource-constrained robotics applications. In this paper, we propose an alternative refinement strategy based on compressed sensing. Using the Discrete Cosine Transform (DCT) as our basis, we construct a ratio matrix that rescales the estimated depth map to align with measured ground truth data. Our experiments demonstrate that this method can significantly reduce the RMSE and MAE of the initial MDE estimate by more than a factor of 15. Furthermore, the proposed approach can outperform state-of-the-art depth completion models at sampling ratios above 50 percent, while also reducing the overall GPU VRAM requirements. This pipeline is modular and compatible with any existing MDE model with no additional training, making it particularly suitable for deployment on GPU-constrained robotic platforms in previously unseen environments.

Index terms

RGB-D Perception Range Sensing