ICRA 2026

ProbeMDE: Uncertainty-Guided Active Proprioception for Monocular Depth Estimation in Surgical Robotics

Britton Jordan, Jordan Thompson, Jesse F. d'Almeida, Hao Li, Nithesh Kumar, Susheela Sharma Stern, James Ferguson, Ipek Oguz, Robert James Webster III, Daniel Brown, Alan Kuntz

AI summary

ProbeMDE uses uncertainty-guided active proprioceptive (touch) sensing to improve monocular depth estimation accuracy in surgical robotics while minimizing the number of physical measurements.
Monocular depth estimation, active sensing, proprioception, uncertainty quantification, surgical robotics, Stein Variational Gradient Descent

Problem

Monocular depth estimation fails in challenging surgical environments due to textureless surfaces and occlusions, yet actively gathering sparse depth measurements via robot touch is costly and unoptimized.

Approach

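The framework trains an ensemble of depth models, each conditioned on the RGB image and the sparse depth measurements gathered so far, and uses the variance across the ensemble as its measure of predictive uncertainty. It then applies Stein Variational Gradient Descent (SVGD) over the gradient of that uncertainty with respect to candidate measurement locations to select diverse, informative points for the robot to physically probe.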
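As a minimal sketch of the uncertainty step, assuming the ensemble's outputs are available as a stack of dense depth maps, the per-pixel variance can be computed as follows. The function name `ensemble_uncertainty`, the array shapes, and the random stand-in predictions are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ensemble_uncertainty(depth_preds):
    """Return (mean depth, per-pixel variance) for a stack of depth maps.

    depth_preds: array of shape (K, H, W), one dense depth map per
    ensemble member (hypothetical layout, assumed for this sketch).
    """
    depth_preds = np.asarray(depth_preds, dtype=float)
    mean_depth = depth_preds.mean(axis=0)  # ensemble prediction, shape (H, W)
    variance = depth_preds.var(axis=0)     # population variance across members
    return mean_depth, variance

# Stand-in predictions: 5 members, 4x6 image, random values in place of real models.
rng = np.random.default_rng(0)
preds = 1.0 + 0.05 * rng.standard_normal((5, 4, 6))
mean_depth, variance = ensemble_uncertainty(preds)
print(mean_depth.shape, variance.shape)
```

Pixels where the members disagree get high variance, which is the signal the selection step then works from.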

Key results

  • Outperforms baselines across standard depth metrics in simulation
  • Achieves higher accuracy with fewer measurements in physical phantom trials
  • First application of proprioception to improve endoscopic monocular depth estimation
  • Avoids mode collapse when selecting probe locations by running SVGD over the uncertainty gradient map
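The mode-collapse point can be illustrated with a generic SVGD update: each particle (a candidate probe location) follows a kernel-weighted average of a score term and is pushed away from nearby particles by a repulsive kernel-gradient term. The sketch below is the textbook SVGD step with an RBF kernel, not the authors' implementation. The two-bump `toy_score`, the bandwidth `h`, and the step size are assumptions chosen for illustration, and how ProbeMDE turns its uncertainty gradient map into the score is not specified here.

```python
import numpy as np

def rbf_kernel(x, h):
    """RBF kernel matrix and its gradients for particles x of shape (n, d)."""
    diff = x[:, None, :] - x[None, :, :]        # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * h ** 2))       # k[i, j] = k(x_i, x_j)
    # grad_k[i, j] = gradient of k(x_j, x_i) with respect to x_j
    grad_k = diff * k[..., None] / h ** 2
    return k, grad_k

def svgd_step(x, score_fn, h=0.3, step=0.05):
    """One SVGD update: attraction along the score plus kernel repulsion."""
    k, grad_k = rbf_kernel(x, h)
    attraction = k @ score_fn(x)                # sum_j k(x_j, x_i) * score(x_j)
    repulsion = grad_k.sum(axis=1)              # sum_j grad_{x_j} k(x_j, x_i)
    return x + step * (attraction + repulsion) / x.shape[0]

def toy_score(x, centers=((-1.0, 0.0), (1.0, 0.0)), sigma=0.4):
    """Gradient of the log of an equal-weight two-bump Gaussian mixture.

    Hypothetical stand-in for an uncertainty-derived score.
    """
    centers = np.asarray(centers, dtype=float)
    diff = x[:, None, :] - centers[None, :, :]  # (n, m, d)
    logw = -np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2)
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)           # per-particle mixture weights
    return -(w[..., None] * diff).sum(axis=1) / sigma ** 2

# Start all particles close together and iterate the update.
rng = np.random.default_rng(0)
particles = rng.normal(scale=0.1, size=(8, 2))
for _ in range(200):
    particles = svgd_step(particles, toy_score)
print(particles)
```

Without the repulsion term, every particle would simply climb toward the nearest high-score region and the selected points could pile up in one place. The kernel gradient is what keeps them spread out.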

Why it matters

Provides a cost-effective, uncertainty-driven sensing strategy that enhances depth perception reliability for minimally invasive surgical robots navigating complex anatomical structures.

Abstract

Monocular depth estimation (MDE) provides a useful tool for robotic perception, but its predictions are often uncertain and inaccurate in challenging environments such as surgical scenes where textureless surfaces, specular reflections, and occlusions are common. To address this, we propose ProbeMDE, a cost-aware active sensing framework that combines RGB images with sparse proprioceptive measurements for MDE. Our approach utilizes an ensemble of MDE models to predict dense depth maps conditioned on both RGB images and a sparse set of known depth measurements obtained via proprioception, where the robot has touched the environment in a known configuration. We quantify predictive uncertainty via the ensemble’s variance and measure the gradient of the uncertainty with respect to candidate measurement locations. To prevent mode collapse while selecting maximally informative locations to propriocept (touch), we leverage Stein Variational Gradient Descent (SVGD) over this gradient map. We validate our method in both simulated and physical experiments on central airway obstruction surgical phantoms. Our results demonstrate that our approach outperforms baseline methods across standard depth estimation metrics, achieving higher accuracy while minimizing the number of required proprioceptive measurements. The project website is brittonjordan.github.io/probe mde

Index terms

Computer Vision for Medical Robotics; Medical Robots and Systems; Deep Learning for Visual Perception
